Imbalance factor: a simple new scale for measuring inter-class imbalance extent in classification problems.

Authors :
Pirizadeh, Mohsen
Farahani, Hadi
Kheradpisheh, Saeed Reza
Source :
Knowledge & Information Systems; Oct2023, Vol. 65 Issue 10, p4157-4183, 27p
Publication Year :
2023

Abstract

Learning from datasets whose classes differ in absolute frequency is one of the most challenging tasks in machine learning. Efforts have been made to tackle the problem of class imbalance by providing solutions at the data and algorithmic levels. To categorize these solutions according to the level of class imbalance in a problem, and to obtain meaningful and consistent interpretations from experiments, it is essential to be able to quantify the extent of dataset imbalance. A competent scale for summarizing the severity of inter-class imbalance must meet at least the following three conditions: (1) it can calculate the imbalance extent for both binary and multi-class datasets, (2) its output lies within a definite, fixed range of values, and (3) it correlates with the performance of different classifiers. Nevertheless, none of the scales introduced so far satisfies all of these requirements. In this study, we propose an informative scale called the imbalance factor (IF), based on information theory, which quantifies the extent of dataset imbalance as a single value in the range [0, 1], independent of the number of classes. In addition, IF offers various limiting cases with different growth rates according to its order α. This property is critical because it can resolve cases in which distinct distributions would otherwise receive the same imbalance extent. Finally, empirical experiments indicate that, with an average correlation of 0.766 with classification accuracy over 15 real datasets, IF is remarkably more sensitive to changes in class imbalance than previous scales. [ABSTRACT FROM AUTHOR]
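
The abstract does not give the formula for IF, so the following is only a minimal sketch of how an entropy-based imbalance score with an order parameter α could behave. It assumes a normalized Rényi entropy, computed as 1 - H_α(p) / log(k) for k classes; the function names `renyi_entropy` and `imbalance_score` are hypothetical, and this is not claimed to be the authors' definition of IF.

```python
import math


def renyi_entropy(probs, alpha):
    """Renyi entropy (natural log) of order alpha for a probability vector."""
    probs = [p for p in probs if p > 0]  # zero-probability classes add nothing
    if alpha == 1:
        # Shannon entropy, the alpha -> 1 limit of the Renyi family.
        return -sum(p * math.log(p) for p in probs)
    if math.isinf(alpha):
        # Min-entropy, the alpha -> infinity limit.
        return -math.log(max(probs))
    return math.log(sum(p ** alpha for p in probs)) / (1 - alpha)


def imbalance_score(class_counts, alpha=1.0):
    """Illustrative score in [0, 1]: 0 = perfectly balanced, 1 = one class only.

    ASSUMPTION: normalized Renyi entropy stands in for the paper's IF,
    whose exact definition is not stated in the abstract.
    """
    k = len(class_counts)
    n = sum(class_counts)
    if k < 2 or n <= 0:
        raise ValueError("need at least two classes and one sample")
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    probs = [c / n for c in class_counts]
    score = 1.0 - renyi_entropy(probs, alpha) / math.log(k)
    return min(1.0, max(0.0, score))  # clamp away floating-point drift


if __name__ == "__main__":
    for counts in ([50, 50], [90, 10], [100, 0], [70, 20, 10]):
        print(counts, round(imbalance_score(counts, alpha=1.0), 3))
```

Under this assumption the score works for binary and multi-class count vectors alike and stays within [0, 1]. Because Rényi entropy does not increase as α grows, a larger α yields a score at least as large for the same distribution, which is one way an order parameter could change the growth rate of such a scale.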

Details

Language :
English
ISSN :
02191377
Volume :
65
Issue :
10
Database :
Complementary Index
Journal :
Knowledge & Information Systems
Publication Type :
Academic Journal
Accession number :
170063049
Full Text :
https://doi.org/10.1007/s10115-023-01881-y