Feature Selection and Hybrid Sampling with Machine Learning Methods for Health Data Classification.
- Source :
- Revue d'Intelligence Artificielle; Aug2024, Vol. 38 Issue 4, p1255-1261, 7p
- Publication Year :
- 2024
Abstract
- This study aims to improve the performance of classification algorithms on imbalanced, high-dimensional health data for stroke prediction by integrating correlation-based feature selection with hybrid sampling. Several previous studies that used machine learning methods to predict stroke achieved less than optimal accuracy. This is because stroke data has several problems, including missing values, many attributes, and class imbalance, all of which can degrade the performance of a classification method. Therefore, this research uses an integrated approach to feature selection and hybrid sampling. The feature selection technique identifies the important attributes in the stroke data. The SMOTE-ENN hybrid sampling approach is then applied to address the data imbalance. The findings indicate that correlation-based feature selection combined with SMOTE-ENN and the Random Forest algorithm leads to better performance than the SVM and XGBoost methods without sampling, with an increase in accuracy of 3%, recall of 91.3%, and AUC of 45.2%. Thus, the proposed method performed better than recent stroke classification studies. [ABSTRACT FROM AUTHOR]
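The two preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the abstract does not give the correlation threshold, the neighbour counts, or the exact feature selection variant, so the simple Pearson filter, the majority-vote ENN rule, the function names, the parameter values, and the toy data below are all assumptions made for illustration. The Random Forest step is omitted; the balanced output would be passed to any classifier.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def select_by_correlation(X, y, threshold=0.1):
    """Indices of columns whose absolute correlation with the label meets the (assumed) threshold."""
    return [j for j in range(len(X[0]))
            if abs(pearson([row[j] for row in X], y)) >= threshold]

def k_nearest(point, candidates, k):
    """Positions of the k candidates closest to point by Euclidean distance."""
    order = sorted(range(len(candidates)), key=lambda i: math.dist(point, candidates[i]))
    return order[:k]

def smote(X, y, minority=1, k=5, seed=0):
    """Oversample the minority class by interpolating between minority neighbours.

    Assumes binary labels and at least two minority samples.
    """
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        others = [r for r in minority_rows if r is not base]
        neighbour = others[rng.choice(k_nearest(base, others, min(k, len(others))))]
        gap = rng.random()  # position along the segment from base to neighbour
        X_new.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
        y_new.append(minority)
    return X_new, y_new

def edited_nearest_neighbours(X, y, k=3):
    """Drop samples whose label disagrees with the majority of their k nearest neighbours."""
    keep = []
    for i, (row, label) in enumerate(zip(X, y)):
        rest = [j for j in range(len(X)) if j != i]
        nn = k_nearest(row, [X[j] for j in rest], k)
        agree = sum(1 for idx in nn if y[rest[idx]] == label)
        if agree * 2 > len(nn):
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]

# Hypothetical toy data: column 0 separates the classes, column 1 is constant.
X = [[0.0, 5.0], [0.1, 5.0], [0.2, 5.0], [0.3, 5.0],
     [0.4, 5.0], [0.5, 5.0], [2.0, 5.0], [2.1, 5.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

selected = select_by_correlation(X, y)                   # feature selection first
X_sel = [[row[j] for j in selected] for row in X]
X_bal, y_bal = edited_nearest_neighbours(*smote(X_sel, y))  # then SMOTE followed by ENN
```

In practice, library implementations of these steps (for example a SMOTE-ENN resampler followed by a Random Forest classifier) would normally replace this hand-written version; the sketch only shows the order of operations the abstract describes.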
Details
- Language :
- English
- ISSN :
- 0992499X
- Volume :
- 38
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Revue d'Intelligence Artificielle
- Publication Type :
- Academic Journal
- Accession number :
- 179446580
- Full Text :
- https://doi.org/10.18280/ria.380419