A Cardiovascular Disease Prediction Model Based on Routine Physical Examination Indicators Using Machine Learning Methods: A Cohort Study

Authors :
Xin Qian
Yu Li
Xianghui Zhang
Heng Guo
Jia He
Xinping Wang
Yizhong Yan
Jiaolong Ma
Rulin Ma
Shuxia Guo
Source :
Frontiers in Cardiovascular Medicine, Vol 9 (2022)
Publication Year :
2022
Publisher :
Frontiers Media S.A., 2022.

Abstract

Background: Cardiovascular diseases (CVD) are currently the leading cause of premature death worldwide. Model-based early detection of high-risk populations for CVD is the key to CVD prevention. Thus, this research aimed to use machine learning (ML) algorithms to establish a CVD prediction model based on routine physical examination indicators suitable for the Xinjiang rural population.

Method: The research cohort data collection was divided into two stages. The first stage involved a baseline survey from 2010 to 2012, with follow-up ending in December 2017. The second-stage baseline survey was conducted from September to December 2016, and follow-up ended in August 2021. A total of 12,692 participants (10,407 Uyghur and 2,285 Kazak) were included in the study. Screening predictors and establishing variable subsets were based on least absolute shrinkage and selection operator (Lasso) regression, logistic regression forward partial likelihood estimation (FLR), random forest (RF) feature importance, and RF variable importance. The selected variable subsets were then used with L1 regularized logistic regression (L1-LR), RF, support vector machine (SVM), and AdaBoost algorithms to establish a CVD prediction model suitable for this population. The incidence of CVD in this population was then analyzed.

Result: After 4.94 years of follow-up, a total of 1,176 people were diagnosed with CVD (cumulative incidence: 9.27%). In the comparison of discrimination and calibration, the prediction performance of the subset of variables selected based on FLR was better than that of the other subsets. Combining the results of discrimination, calibration, and clinical validity, the prediction model based on L1-LR had the best prediction performance. Age, systolic blood pressure, the low-density lipoprotein cholesterol/high-density lipoprotein cholesterol ratio (LDL-C/HDL-C), triglyceride-glucose index, body mass index, and body adiposity index were all important predictors of the onset of CVD in the Xinjiang rural population.

Conclusion: In the Xinjiang rural population, the prediction model based on L1-LR had the best prediction performance.

Details

Language :
English
ISSN :
2297-055X
Volume :
9
Database :
Directory of Open Access Journals
Journal :
Frontiers in Cardiovascular Medicine
Publication Type :
Academic Journal
Accession number :
edsdoj.00598f66ac5a49fe873ce91997984875
Document Type :
article
Full Text :
https://doi.org/10.3389/fcvm.2022.854287