Deep convolutional neural networks with genetic algorithm-based synthetic minority over-sampling technique for improved imbalanced data classification.

Authors :
Alex, Suja A.
Jesu Vedha Nayahi, J.
Kaddoura, Sanaa
Source :
Applied Soft Computing; May 2024, Vol. 156, pN.PAG-N.PAG, 1p
Publication Year :
2024

Abstract

Imbalanced data classification presents a challenge in machine learning, inducing biased model learning. Moreover, data dimensionality poses another challenge as it highly impacts classifier performance. This paper proposes a new deep-learning method that combines feature selection with oversampling to address these challenges. The proposed approach, GA-SMOTE-DCNN, integrates a genetic algorithm (GA) for feature selection, SMOTE for oversampling, and a deep 1D-convolutional neural network (DCNN) for classification. This study reveals that pre-splitting the data into training and testing sets before applying SMOTE results in higher accuracy, showing an improvement in accuracy ranging between 1.94% and 3.98% compared to post-SMOTE splitting for each dataset. This method achieved accuracy rates of 86.81% for the Balance Scale dataset, 86.15% for the Oil Spill dataset, 89.21% for the Yeast dataset, 91.32% for the Mammography dataset, 88.23% for the Australian credit dataset, and 89.53% for the German Credit dataset when compared with benchmark methods, underscoring its significance in tackling high-dimensional and imbalanced data classification problems. This method demonstrates scalability in effectively addressing challenges associated with high-dimensional and imbalanced data classification across various domains.

• Propose a novel GA-SMOTE-DCNN technique to solve biased prediction and overfitting due to data dimensionality and imbalance.
• Test the proposed model on six datasets related to different domains and compare it to models proposed in the literature.
• Compare feature selection method with other filter and wrapper methods to prove effectiveness in enhancing classification.

[ABSTRACT FROM AUTHOR]
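The ordering the abstract describes, splitting into training and testing sets first and oversampling afterwards, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`stratified_split`, `smote_like`, `split_then_oversample`), the binary labels with `1` as the minority class, the 25% test fraction, and `k=3` are all assumptions made for illustration, and the interpolation step is a simplified SMOTE-style routine rather than a full SMOTE library. The GA feature selection and DCNN stages are omitted.

```python
import math
import random


def stratified_split(y, test_fraction, rng):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    train_idx, test_idx = [], []
    for label in sorted(set(y)):
        idx = [i for i, lab in enumerate(y) if lab == label]
        rng.shuffle(idx)
        n_test = int(len(idx) * test_fraction)
        test_idx += idx[:n_test]
        train_idx += idx[n_test:]
    return train_idx, test_idx


def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def split_then_oversample(X, y, test_fraction=0.25, k=3, seed=0):
    """Split first, then balance ONLY the training set.

    Assumes binary labels where 1 is the minority class (hypothetical
    convention). The test set is returned untouched, so no synthetic
    point derived from a test sample can leak into training.
    """
    rng = random.Random(seed)
    train_idx, test_idx = stratified_split(y, test_fraction, rng)
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    X_test = [X[i] for i in test_idx]
    y_test = [y[i] for i in test_idx]

    minority = [x for x, lab in zip(X_train, y_train) if lab == 1]
    n_new = y_train.count(0) - len(minority)
    synthetic = smote_like(minority, max(n_new, 0), k, rng)
    return X_train + synthetic, y_train + [1] * len(synthetic), X_test, y_test
```

Calling `split_then_oversample(X, y)` on a list of feature rows `X` and labels `y` returns a balanced training set together with a test set that contains only original samples. Oversampling before the split would instead allow synthetic points interpolated from test-set samples to appear in the training data.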

Details

Language :
English
ISSN :
15684946
Volume :
156
Database :
Supplemental Index
Journal :
Applied Soft Computing
Publication Type :
Academic Journal
Accession number :
176357889
Full Text :
https://doi.org/10.1016/j.asoc.2024.111491