1. MACE prediction using high-dimensional machine learning and mechanistic interpretation: A longitudinal cohort study in US veterans
- Author
- Sayera Dhaubhadel, Beauty Kolade, Ruy M. Ribeiro, Kumkum Ganguly, Nicolas W. Hengartner, Tanmoy Bhattacharya, Judith D. Cohn, Khushbu Agarwal, Kelly Cho, Lauren Costa, Yuk-Lam Ho, Allison E. Murata, Glen H. Murata, Jason L. Vassy, Daniel C. Posner, J. Michael Gaziano, Yan V. Sun, Peter W. Wilson, Ravi Madduri, Amy C. Justice, Phil Tsao, Christopher J. O’Donnell, Scott Damrauer, and Benjamin H. McMahon
- Abstract
High-dimensional predictive models of Major Adverse Cardiac Events (MACE), which include heart attack (AMI), stroke, and death caused by cardiovascular disease (CVD), were built using four longitudinal cohorts of Veterans Administration (VA) patients created from VA medical records. We considered 247 variables (risk factors) measured across 7.5 years for millions of patients in order to compare predictions of the first reported MACE using six distinct modelling methodologies. The best-performing methodology varied across the four cohorts. Model coefficients related to disease pathophysiology and treatment were relatively constant across cohorts, while coefficients dependent upon the confounding variables of age and healthcare utilization varied considerably across cohorts. In particular, models trained on a retrospective case-control (Rcc) cohort (where controls are matched to cases by date-of-birth cohort and overall level of healthcare utilization) emphasize variables describing pathophysiology and treatment, while predictions based on the cohort of all active patients at the start of 2017 (C-17) rely much more on age and variables reflecting healthcare utilization. Consequently, directly using an Rcc-trained model to evaluate the C-17 cohort resulted in poor performance (C-statistic = 0.65). However, a simple reoptimization of the model's dependence on age, demographics, and five other variables improved the C-statistic to 0.74, nearly matching the 0.76 obtained on C-17 by a C-17-trained model. The dependence of MACE risk on biomarkers for hypertension, cholesterol, diabetes, body mass index, and renal function in our models was consistent with the literature. At the same time, including medications and procedures provided important indications of both disease severity and the level of treatment. More detailed study designs will be required to disentangle these effects.
- Published
- 2022
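
The abstract compares models by their C-statistic (0.65, 0.74, 0.76). As a reading aid, the sketch below shows one common way such a number is computed for a binary outcome: the fraction of case-control pairs in which the case receives the higher predicted risk, with ties counted as one half. This is an illustrative assumption about the metric, not the authors' code; the function name `c_statistic` and the toy data are hypothetical.

```python
from itertools import product


def c_statistic(risk_scores, outcomes):
    """Concordance for a binary outcome (1 = event, 0 = no event).

    Returns the fraction of (case, control) pairs in which the case
    has the higher score; tied scores contribute 0.5.
    """
    cases = [s for s, y in zip(risk_scores, outcomes) if y == 1]
    controls = [s for s, y in zip(risk_scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    concordant = 0.0
    for case_score, control_score in product(cases, controls):
        if case_score > control_score:
            concordant += 1.0
        elif case_score == control_score:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: five patients, two of whom had an event.
scores = [0.9, 0.4, 0.35, 0.8, 0.2]
events = [1, 0, 1, 0, 0]
print(round(c_statistic(scores, events), 3))  # 4 of 6 pairs concordant -> 0.667
```

A value of 0.5 corresponds to ranking cases and controls at random and 1.0 to perfect ranking. The pairwise loop is quadratic in cohort size, so for cohorts of millions of patients a rank-based computation would be needed instead.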