
Hyperparameter optimization: a comparative machine learning model analysis for enhanced heart disease prediction accuracy.

Authors :
Rimal, Yagyanath
Sharma, Navneet
Source :
Multimedia Tools & Applications; May2024, Vol. 83 Issue 18, p55091-55107, 17p
Publication Year :
2024

Abstract

An optimizer carries out hyperparameter tuning, updating the machine learning model after each weight and loss adjustment step over the input features. Combining high and low learning rates with various step sizes ultimately leads to an optimally tuned model. With a small step size and learning rate, the updates become much smaller, allowing the gradient (the slope of the tangent) to approach the global minimum gradually. The primary goal of this study is to compare the heart disease prediction accuracy achieved by various optimization algorithms. Heart disease treatment requires ensemble hyperparameter tuning for accurate prediction and classification because of multiple feature dependencies. The study analyzed model tuning techniques using the AUC and confusion matrix, revealing improvements in precision, recall, and F1 score from default to optimized models. The Hyperopt Bayesian optimizer and TPOT classifiers were used with genetic populations and offspring over 5 and 10 generations, while Optuna optimization with frozen trials was combined with a random forest algorithm. The default random forest (86.6%), Bayesian optimization with random forest (89%), and Bayesian optimization with support vector machines (90%) achieved the highest accuracies of all models. The genetic algorithm with five generations (86.8%) and GASearchCV with 10 generations (88.5%) achieved the second-highest accuracies, while Optuna's support vector machine model (84%) had the lowest accuracy. This research further compares the accuracy, precision, recall, F1 score, macro average, and confusion matrix of each optimized model against its actual execution time. Following exploratory data analysis and data pre-processing, predictive accuracy was further tested after designing a pipeline with one-hot encoding and standard scaling for the enhanced (31-feature) data set and the heart disease data (13 features). Among standalone default machine learning models, the Gaussian algorithm (84%), logistic regression (83%), and other classification models predict with higher accuracy than dummy classifiers (54%). [ABSTRACT FROM AUTHOR]
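
The abstract compares models by accuracy, precision, recall, F1 score, and macro average, all of which derive from a confusion matrix. As a minimal sketch of how those figures relate, the following computes them for a binary classifier. The function names (`per_class_metrics`, `summarize`) and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def per_class_metrics(tp, fp, fn):
    """Return (precision, recall, F1) for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def summarize(tn, fp, fn, tp):
    """Summarize a 2x2 confusion matrix (rows: actual, columns: predicted)."""
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total

    # Positive class ("disease"): counts are used as given.
    positive = per_class_metrics(tp, fp, fn)
    # Negative class ("no disease"): the roles swap, so true negatives act
    # as hits, false negatives as false alarms, false positives as misses.
    negative = per_class_metrics(tn, fn, fp)

    # Macro average: unweighted mean of the per-class scores.
    macro = tuple((p + n) / 2 for p, n in zip(positive, negative))
    return {
        "accuracy": accuracy,
        "positive": positive,
        "negative": negative,
        "macro": macro,
    }


if __name__ == "__main__":
    # Hypothetical counts, invented for illustration (not study results).
    report = summarize(tn=40, fp=10, fn=5, tp=45)
    for name, value in report.items():
        print(name, value)
```

Because the macro average weights both classes equally, it can differ from accuracy when the classes are imbalanced, which is one reason to report both.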

Details

Language :
English
ISSN :
13807501
Volume :
83
Issue :
18
Database :
Complementary Index
Journal :
Multimedia Tools & Applications
Publication Type :
Academic Journal
Accession number :
177251018
Full Text :
https://doi.org/10.1007/s11042-023-17273-x