Noise-free sampling with majority framework for an imbalanced classification problem.

Authors :
Firdausanti, Neni Alya
Mendonça, Israel
Aritsugi, Masayoshi
Source :
Knowledge & Information Systems; Jul2024, Vol. 66 Issue 7, p4011-4042, 32p
Publication Year :
2024

Abstract

Class imbalance has been widely accepted as a significant factor that negatively impacts a machine learning classifier's performance. One of the techniques to avoid this problem is to balance the data distribution by using sampling-based approaches, in which synthetic data is generated using the probability distribution of the classes. However, this process is sensitive to the presence of noise in the data, and the boundaries between the majority class and the minority class are blurred. Such phenomena shift the algorithm's decision boundary away from the ideal outcome. In this work, we propose a hybrid framework for two primary objectives. The first objective is to address class distribution imbalance by synthetically increasing the data of a minority class, and the second objective is to devise an efficient noise reduction technique that improves the class balance algorithm. The proposed framework focuses on removing noisy elements from the majority class, and by doing so, provides more accurate information to the subsequent synthetic data generator algorithm. To evaluate the effectiveness of our framework, we employ the geometric mean (G-mean) as the evaluation metric. The experimental results show that our framework is capable of improving the prediction G-mean for eight classifiers across eleven datasets. The range of improvements varies from 7.78% on the Loan dataset to 67.45% on the Abalone19_vs_10-11-12-13 dataset. [ABSTRACT FROM AUTHOR]
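
Note on the metric: the abstract names G-mean but does not spell out its formula. A minimal sketch follows, assuming the common binary definition (the square root of sensitivity times specificity); the function name g_mean and the toy labels are illustrative and are not taken from the paper.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumes the common binary definition; the paper's exact formula
    is not given in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Toy data (hypothetical): 2 minority samples, 8 majority samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_majority = [0] * 10                     # never predicts the minority class
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]       # finds one of two minority samples

print(g_mean(y_true, always_majority))
print(g_mean(y_true, one_hit))
```

A classifier that always predicts the majority class is 80% accurate on this toy data, yet its sensitivity is zero, so its G-mean is zero. That property is a usual reason for preferring G-mean over accuracy on imbalanced data.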

Details

Language :
English
ISSN :
02191377
Volume :
66
Issue :
7
Database :
Complementary Index
Journal :
Knowledge & Information Systems
Publication Type :
Academic Journal
Accession number :
178029260
Full Text :
https://doi.org/10.1007/s10115-024-02079-6