1. A comprehensive comparison of machine learning models for ICH prognostication: Retrospective review of 1501 intra-cerebral hemorrhage patients from the Qatar stroke database.
- Author
- Ali, Aizaz; Ayub, Umar T.; Gharaibeh, Khaled; Rao, Rahul; Akhtar, Naveed; Jumaa, Mouhammad; and Shuaib, Ashfaq
- Subjects
- *MACHINE learning, *RANDOM forest algorithms, *DATABASES, *CEREBRAL hemorrhage, *PROGNOSTIC models
- Abstract
Multiple prognostic scores have been developed to predict morbidity and mortality in patients with spontaneous intracerebral hemorrhage (sICH). Since the advent of machine learning (ML), different ML models have also been developed for sICH prognostication. There is, however, a need to verify the validity of these ML models in diverse patient populations. We aim to create ML models for prognostication in the Qatari population. By incorporating inpatient variables into model development, we aim to leverage more information. A total of 1501 consecutive patients with acute sICH admitted to Hamad General Hospital (HGH) between 2013 and 2023 were included. We trained, evaluated, and compared several ML models to predict 90-day mortality and functional outcomes. We randomly selected 80% of patients for model training and 20% for validation, and used k-fold cross-validation to train our models. The ML workflow included class-imbalance correction and dimensionality reduction so that the effect of each could be evaluated. Evaluation metrics such as sensitivity, specificity, and F1 score were calculated for each prognostic model. Mean age was 50.8 (SD 13.1) years and 1257 (83.7%) patients were male. Median ICH volume was 7.5 ml (IQR 12.6). A total of 222 (14.8%) patients died, while 897 (59.7%) achieved a good functional outcome at 90 days. For 90-day mortality, random forest (RF) achieved the highest AUC (0.906), whereas for 90-day functional outcomes, logistic regression (LR) achieved the highest AUC (0.888). Ensembling gave results similar to those of the best-performing models, namely RF and LR, with an AUC of 0.904 for mortality and 0.883 for functional outcomes. RF achieved the highest AUC for 90-day mortality, and LR achieved the highest AUC for 90-day functional outcomes. Across the ML models compared, differences in performance were minimal. By creating an ensemble of our best-performing individual models, we maintained maximum accuracy and decreased the variance of functional outcome and mortality predictions compared with individual models.
Key points:
• 1st Key Point: We compared different machine learning models for ICH prognostication. Random forest was the most accurate model for mortality, and logistic regression was the most accurate model for functional outcomes.
• 2nd Key Point: Ensembling models maintained accuracy and decreased variance.
[ABSTRACT FROM AUTHOR]
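The abstract names its evaluation metrics (sensitivity, specificity, F1 score, AUC) and an ensembling step without showing how they are computed. The following is a minimal sketch in standard-library Python, not the authors' pipeline. It assumes binary labels where 1 marks the outcome of interest (for example, death at 90 days), a 0.5 decision cutoff, and an ensemble formed by averaging predicted probabilities (soft voting), none of which the abstract specifies. All function names are hypothetical and the probabilities are toy values, not data from the study.

```python
from statistics import mean


def auc(y_true, scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((1.0 if p > n else (0.5 if p == n else 0.0)) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold(probs, cutoff=0.5):
    """Turn probabilities into 0/1 predictions (the cutoff is an assumption)."""
    return [1 if p >= cutoff else 0 for p in probs]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and F1 score from binary labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


def soft_vote(*model_probs):
    """Average each patient's predicted probability across models."""
    return [mean(ps) for ps in zip(*model_probs)]


# Toy values for eight hypothetical patients (NOT data from the study).
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
rf_probs = [0.9, 0.2, 0.4, 0.1, 0.6, 0.8, 0.3, 0.2]  # stand-in for a random forest
lr_probs = [0.7, 0.3, 0.7, 0.2, 0.3, 0.7, 0.4, 0.1]  # stand-in for logistic regression
ensemble_probs = soft_vote(rf_probs, lr_probs)

for name, probs in [("RF", rf_probs), ("LR", lr_probs), ("Ensemble", ensemble_probs)]:
    metrics = binary_metrics(y_true, threshold(probs))
    print(name, "AUC =", round(auc(y_true, probs), 3), metrics)
```

AUC is computed from the raw probabilities and so does not depend on the cutoff, whereas sensitivity, specificity, and F1 all change if a threshold other than 0.5 is chosen.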
- Published
- 2024