Accuracy and diversity-aware multi-objective approach for random forest construction

Authors :
Nour El Islem Karabadji
Abdelaziz Amara Korba
Ali Assi
Hassina Seridi
Sabeur Aridhi
Wajdi Dhifli
Laboratoire de Gestion Electronique de Document [Annaba] (LabGED)
Université Badji Mokhtar Annaba (UBMA)
Laboratoire De Technologies Des Systemes Energetiques (LTSE), E3360100, Annaba, P.O. Box 218, 23000, Algeria
Laboratoire Informatique, Image et Interaction - EA 2118 (L3I)
La Rochelle Université (ULR)
House of Commons, 181 Queen Street, Ottawa, Ontario K1A 0A6, Canada
Computational Algorithms for Protein Structures and Interactions (CAPSID)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Evaluation des technologies de santé et des pratiques médicales - ULR 2694 (METRICS)
Université de Lille-Centre Hospitalier Régional Universitaire [Lille] (CHRU Lille)
Source :
Expert Systems with Applications, 2023, 225 (1), pp.120138. ⟨10.1016/j.eswa.2023.120138⟩
Publication Year :
2023
Publisher :
HAL CCSD, 2023.

Abstract

Random Forest is an ensemble classification approach that builds a finite set of decision trees from bootstrap samples and random attribute selection. Random Forests generalize well because the training and attribute subsets used to construct the individual trees vary across the forest. However, constructing a robust and effective random forest requires addressing two main issues: (1) increasing the accuracy and diversity of the decision trees; and (2) decreasing the number of decision trees. This paper proposes a genetic algorithm-based approach to these challenges. Three objectives are considered. The first is to strengthen the classification accuracy of the individual decision trees as well as that of the forest. The second is to use diversity measures among the decision trees to improve the generalization of the constructed model. The third is to minimize the number of trees in the forest by finding an optimal subset of the random forest. An experimental evaluation is conducted on several datasets from the UCI Machine Learning Repository. The results show that the proposed approach outperforms state-of-the-art classical and evolutionary random forest construction methods. Finally, the proposed approach is used to build a reliable random forest model for detecting botnet traffic in Internet of Things environments.
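
The three objectives named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the encoding, the diversity measure, or the selection scheme, so everything below is an assumption. A candidate forest is encoded as a tuple of booleans over already-trained trees, diversity is taken to be the mean pairwise disagreement rate, and survival is plain non-dominated filtering with no population cap. The names `objectives`, `dominates`, `pareto_front`, and `evolve`, and the toy predictions, are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations

def objectives(preds, labels, mask):
    """Return (accuracy, diversity, size) for the trees selected by mask."""
    chosen = [p for p, keep in zip(preds, mask) if keep]
    # Majority vote over the selected trees, one column per sample.
    votes = [Counter(col).most_common(1)[0][0] for col in zip(*chosen)]
    accuracy = sum(v == y for v, y in zip(votes, labels)) / len(labels)
    # Diversity: mean pairwise disagreement rate (an assumed measure).
    pairs = list(combinations(chosen, 2))
    if pairs:
        diversity = sum(
            sum(a != b for a, b in zip(p, q)) / len(labels) for p, q in pairs
        ) / len(pairs)
    else:
        diversity = 0.0
    return accuracy, diversity, len(chosen)

def dominates(f, g):
    """True if f is no worse than g on all objectives and better on one."""
    no_worse = f[0] >= g[0] and f[1] >= g[1] and f[2] <= g[2]
    better = f[0] > g[0] or f[1] > g[1] or f[2] < g[2]
    return no_worse and better

def pareto_front(masks, preds, labels):
    """Keep the (mask, objectives) pairs that no other candidate dominates."""
    scored = [(m, objectives(preds, labels, m)) for m in sorted(masks)]
    return [(m, f) for m, f in scored
            if not any(dominates(g, f) for _, g in scored)]

def evolve(preds, labels, generations=30, pop_size=20, seed=0):
    """Toy evolutionary loop: bit-flip mutation plus non-dominated survival."""
    rng = random.Random(seed)
    n = len(preds)
    population = {(True,) * n}  # always include the full forest
    for _ in range(pop_size):
        mask = tuple(rng.random() < 0.5 for _ in range(n))
        if any(mask):  # an empty forest cannot vote
            population.add(mask)
    for _ in range(generations):
        children = set()
        for parent in sorted(population):
            i = rng.randrange(n)  # flip one randomly chosen bit
            child = parent[:i] + (not parent[i],) + parent[i + 1:]
            if any(child):
                children.add(child)
        merged = population | children
        population = {m for m, _ in pareto_front(merged, preds, labels)}
    return pareto_front(population, preds, labels)

# Hypothetical predictions of four trees on four validation samples.
preds = [[0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1]]
labels = [0, 1, 1, 0]
for mask, (acc, div, size) in evolve(preds, labels):
    print(mask, round(acc, 2), round(div, 2), size)
```

Each printed row is one trade-off between forest accuracy, diversity, and size; a practical method would likely add crossover and a bounded population, and would evaluate on held-out data.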

Details

Language :
English
ISSN :
09574174
Database :
OpenAIRE
Journal :
Expert Systems with Applications, 2023, 225 (1), pp.120138. ⟨10.1016/j.eswa.2023.120138⟩
Accession number :
edsair.doi.dedup.....4e24cc1844ab4eb02ddd8d35af1f712c
Full Text :
https://doi.org/10.1016/j.eswa.2023.120138