A new boundary-degree-based oversampling method for imbalanced data.

Authors :
Chen, Yueqi
Pedrycz, Witold
Yang, Jie
Source :
Applied Intelligence; Nov2023, Vol. 53 Issue 22, p26518-26541, 24p
Publication Year :
2023

Abstract

Imbalanced data constitute a significant challenge in practical applications, as standard classifiers are usually designed to work on data with balanced class label distributions. One effective approach to the imbalance problem is boundary oversampling, which focuses only on the classification of boundary samples. However, most boundary oversampling methods select boundary samples for oversampling only roughly, without considering the potentially useful boundary characteristics inherent in the majority (negative) class. To overcome this limitation, we propose a novel boundary-degree-based oversampling method (BDO) in this paper. The originality of BDO stems from quantifying, in terms of probability and using information entropy, the degree to which each negative sample can be regarded as a boundary sample. By applying the sigma rule to the quantified boundary degree, negative boundary samples are determined and then used to indirectly select minority (positive) boundary samples for oversampling. In this way, a substantial amount of information hidden in the negative class can be mined. To further transfer the mined information to the oversampling step, BDO iteratively synthesizes aided boundary points along a fraudulent gradient. Finally, oversampling is performed on both the positive boundary samples and the aided boundary points. Experimental results on 15 benchmark imbalanced datasets, two multi-label datasets and one large-scale dataset, measured in terms of G-mean, F-measure, AUC, accuracy, TPR and TNR, show that BDO exhibits better performance and is competitive with some commonly considered methods. [ABSTRACT FROM AUTHOR]
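
The entropy-and-sigma-rule selection described in the abstract can be illustrated with a minimal sketch. This is not the authors' BDO implementation: the abstract does not give the formulas, so the k-nearest-neighbour estimate of the class probability, the binary entropy used as the boundary degree, the threshold of mean plus `n_sigma` standard deviations, and the names `boundary_degree` and `select_boundary` are all assumptions made for illustration. The synthesis of aided boundary points along a gradient is not covered.

```python
import numpy as np

def boundary_degree(X, y, k=5):
    """Hypothetical boundary degree: binary entropy of the class mix among
    the k nearest neighbours of each negative (label 0) sample."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    neg_idx = np.flatnonzero(y == 0)
    # Euclidean distances from every negative sample to every sample.
    d = np.linalg.norm(X[neg_idx, None, :] - X[None, :, :], axis=2)
    d[np.arange(len(neg_idx)), neg_idx] = np.inf  # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]             # indices of the k nearest neighbours
    p = (y[nn] == 1).mean(axis=1)                 # assumed estimate of P(positive)
    q = 1.0 - p
    with np.errstate(divide="ignore", invalid="ignore"):
        h = -(np.where(p > 0, p * np.log2(p), 0.0)
              + np.where(q > 0, q * np.log2(q), 0.0))
    return neg_idx, nn, h

def select_boundary(X, y, k=5, n_sigma=1.0):
    """Assumed sigma rule: a negative sample is a boundary sample when its
    degree exceeds mean + n_sigma * std. Positive boundary samples are the
    positive neighbours of those negative boundary samples."""
    y = np.asarray(y)
    neg_idx, nn, h = boundary_degree(X, y, k)
    mask = h > h.mean() + n_sigma * h.std()
    neg_boundary = neg_idx[mask]
    neighbours = np.unique(nn[mask])
    pos_boundary = neighbours[y[neighbours] == 1]
    return neg_boundary, pos_boundary

if __name__ == "__main__":
    # Toy 1-D data: six negatives on a line, two positives just past the end.
    X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [5.4], [5.6]]
    y = [0, 0, 0, 0, 0, 0, 1, 1]
    print(select_boundary(X, y, k=3))
```

Negative samples whose neighbourhood is purely negative get a degree of zero, while those with a mixed neighbourhood get a degree close to one, so the threshold keeps only the negatives nearest the positive class.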

Details

Language :
English
ISSN :
0924-669X
Volume :
53
Issue :
22
Database :
Complementary Index
Journal :
Applied Intelligence
Publication Type :
Academic Journal
Accession number :
173178593
Full Text :
https://doi.org/10.1007/s10489-023-04846-4