A Comprehensive Investigation of the Performances of Different Machine Learning Classifiers with SMOTE-ENN Oversampling Technique and Hyperparameter Optimization for Imbalanced Heart Failure Dataset.

Authors :
Muntasir Nishat, Mirza
Faisal, Fahim
Jahan Ratul, Ishrak
Al-Monsur, Abdullah
Ar-Rafi, Abrar Mohammad
Nasrullah, Sarker Mohammad
Reza, Md Taslim
Khan, Md Rezaul Hoque
Source :
Scientific Programming. 3/9/2022, p1-17. 17p.
Publication Year :
2022

Abstract

Heart failure is a chronic cardiac condition characterized by a reduced supply of blood to the body due to impaired contractile properties of the heart muscle. Like any other cardiac disorder, heart failure is a serious ailment that limits the activities and shortens the lifespan of the patient, and it is often fatal. Predicting the survival of patients with heart failure is key to effective intervention and a good prognosis in terms of both treatment and quality of life. Machine learning techniques can be critical in this regard since they can be used to predict the survival of patients with heart failure in advance, allowing patients to receive appropriate treatment. Hence, six supervised machine learning algorithms have been studied and applied to analyze a dataset of 299 individuals from the UCI Machine Learning Repository and predict their survivability from heart failure. Three distinct approaches have been followed using Decision Tree Classifier, Logistic Regression, Gaussian Naïve Bayes, Random Forest Classifier, K-Nearest Neighbors, and Support Vector Machine algorithms. Data scaling has been performed as a preprocessing step using the standard and min–max scaling methods. In addition, grid search cross-validation and random search cross-validation have been employed to optimize the hyperparameters. The combined synthetic minority oversampling technique and edited nearest neighbor (SMOTE-ENN) data resampling technique is also applied, and the performances of all the approaches have been compared extensively. The experimental results indicate that Random Forest Classifier (RFC) surpasses all other approaches, with a test accuracy of 90% when used in combination with SMOTE-ENN and standard scaling. This investigation therefore illustrates the applicability and compatibility of different machine learning algorithms on such an imbalanced dataset and presents the role of the SMOTE-ENN algorithm and hyperparameter optimization in enhancing the performance of the machine learning algorithms. [ABSTRACT FROM AUTHOR]
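
The SMOTE-ENN resampling named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names (`nearest`, `smote`, `edited_nearest_neighbours`), the toy data, and the choices of `k` are assumptions made for illustration, and the cleaning step here uses a simple majority vote among neighbours, which is only one of several ENN variants. In practice a maintained implementation such as the `SMOTEENN` class in the imbalanced-learn package would normally be used instead.

```python
import math
import random
from collections import Counter


def nearest(point, rows, k):
    """Return indices of the k rows closest to `point` (Euclidean distance)."""
    order = sorted(range(len(rows)), key=lambda i: math.dist(point, rows[i]))
    return order[:k]


def smote(X, y, minority, k=3, rng=None):
    """Binary-class SMOTE sketch: add synthetic minority rows until classes match."""
    rng = rng or random.Random(0)
    pool = [x for x, label in zip(X, y) if label == minority]
    deficit = (len(X) - len(pool)) - len(pool)  # majority count minus minority count
    X_new, y_new = list(X), list(y)
    for _ in range(max(deficit, 0)):
        base = rng.choice(pool)
        # The closest row to `base` is `base` itself (distance 0), so skip it.
        neighbours = nearest(base, pool, k + 1)[1:]
        mate = pool[rng.choice(neighbours)]
        gap = rng.random()
        # The synthetic row lies on the segment between `base` and `mate`.
        X_new.append([b + gap * (m - b) for b, m in zip(base, mate)])
        y_new.append(minority)
    return X_new, y_new


def edited_nearest_neighbours(X, y, k=3):
    """ENN sketch: drop rows whose label disagrees with the vote of their k neighbours."""
    keep = []
    for i, point in enumerate(X):
        others = [j for j in range(len(X)) if j != i]
        ranked = sorted(others, key=lambda j: math.dist(point, X[j]))[:k]
        vote = Counter(y[j] for j in ranked).most_common(1)[0][0]
        if vote == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data invented for illustration: 8 majority rows (label 0), 3 minority rows (label 1).
# The majority row [0.9, 0.9] deliberately overlaps the minority cluster.
X = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2],
     [0.4, 0.4], [0.2, 0.5], [0.5, 0.1], [0.9, 0.9],
     [1.0, 1.0], [1.1, 0.9], [0.9, 1.2]]
y = [0] * 8 + [1] * 3

X_bal, y_bal = smote(X, y, minority=1, k=2)                       # oversample
X_clean, y_clean = edited_nearest_neighbours(X_bal, y_bal, k=3)   # then clean

print("before:     ", sorted(Counter(y).items()))
print("after SMOTE:", sorted(Counter(y_bal).items()))
print("after ENN:  ", sorted(Counter(y_clean).items()))
```

On this toy data the oversampling step balances the two classes, and the cleaning step then discards the majority row that sits inside the minority cluster. In a real study, resampling of this kind would be applied to the training split only, so that the test set keeps its original class distribution.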

Details

Language :
English
ISSN :
10589244
Database :
Academic Search Index
Journal :
Scientific Programming
Publication Type :
Academic Journal
Accession number :
155651534
Full Text :
https://doi.org/10.1155/2022/3649406