Self-tuning framework to reduce the number of false positive instances using aggregation functions in ensemble classifier.

Authors :: Gałka, Wojciech
Bazan, Jan G.
Bentkowska, Urszula
Mrukowicz, Marcin
Drygaś, Paweł
Ochab, Marcin
Suszalski, Piotr
Obara, Sebastian
Source :: Procedia Computer Science; 2024, Vol. 246, p4028-4037, 10p
Publication Year :: 2024
Abstract: In this contribution, a model dedicated to reducing the number of false positive instances is proposed. It is a self-tuning model that uses aggregation functions and time-series data periods. As a case study, the proposed model is tested in the context of phishing link detection. In the proposed model, well-known aggregation functions are applied to combine the confidence values of multiple classification models for email phishing. Dividing the dataset into multiple segments and subsets facilitates the implementation of incremental learning strategies. This approach enables the iterative enhancement of model performance through training on new data while leveraging previously acquired knowledge. Two datasets are considered in our research: the existing PhiUSIIL phishing URL dataset and a dataset provided by the FreshMail company. The proposed algorithm achieves a small number of expected false positives. This reduces the costs associated with manual analysis of such cases by domain experts (that is, cases in which a message is incorrectly predicted to be phishing mail). [ABSTRACT FROM AUTHOR]
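
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the aggregation functions, the classifiers, or the decision threshold, so the function names (`aggregate`, `is_phishing`), the four aggregators, and the 0.5 threshold below are all assumptions chosen for illustration.

```python
from statistics import mean, median

# Hypothetical set of well-known aggregation functions; the abstract
# does not say which ones the authors actually used.
AGGREGATORS = {
    "mean": mean,
    "median": median,
    "min": min,
    "max": max,
}


def aggregate(confidences, method="mean"):
    """Combine per-classifier phishing confidences (each in [0, 1]) into one score."""
    if not confidences:
        raise ValueError("at least one confidence value is required")
    return AGGREGATORS[method](confidences)


def is_phishing(confidences, method="mean", threshold=0.5):
    """Flag an item as phishing when the aggregated score reaches the threshold.

    The threshold of 0.5 is an assumed default, not a value from the paper.
    """
    return aggregate(confidences, method) >= threshold


# Three hypothetical classifiers disagree about one message.
scores = [0.9, 0.6, 0.3]
for name in AGGREGATORS:
    print(name, aggregate(scores, name), is_phishing(scores, name))
```

In this sketch a conservative aggregator such as `min` flags a message only when every classifier is confident, which is one plausible way an aggregation choice could trade recall for fewer false positives.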

Subjects :: MACHINE learning
PHISHING
LEARNING strategies
ALGORITHMS
POSTAL service
