Nonexercise machine learning models for maximal oxygen uptake prediction in national population surveys.

Authors :
Liu, Yuntian
Herrin, Jeph
Huang, Chenxi
Khera, Rohan
Dhingra, Lovedeep Singh
Dong, Weilai
Mortazavi, Bobak J
Krumholz, Harlan M
Lu, Yuan
Source :
Journal of the American Medical Informatics Association; May 2023, Vol. 30 Issue 5, p943-952, 10p, 1 Diagram, 1 Chart, 2 Graphs
Publication Year :
2023

Abstract

Objective: Nonexercise algorithms are cost-effective methods to estimate cardiorespiratory fitness (CRF), but the existing models have limitations in generalizability and predictive power. This study aims to improve the nonexercise algorithms using machine learning (ML) methods and data from US national population surveys.

Materials and Methods: We used the 1999–2004 data from the National Health and Nutrition Examination Survey (NHANES). Maximal oxygen uptake (VO₂ max), measured through a submaximal exercise test, served as the gold standard measure for CRF in this study. We applied multiple ML algorithms to build 2 models: a parsimonious model using commonly available interview and examination data, and an extended model additionally incorporating variables from Dual-Energy X-ray Absorptiometry (DEXA) and standard laboratory tests in clinical practice. Key predictors were identified using Shapley additive explanation (SHAP).

Results: Among the 5668 NHANES participants in the study population, 49.9% were women and the mean (SD) age was 32.5 years (10.0). The light gradient boosting machine (LightGBM) had the best performance across multiple types of supervised ML algorithms. Compared with the best existing nonexercise algorithms that could be applied to the NHANES, the parsimonious LightGBM model (RMSE: 8.51 ml/kg/min [95% CI: 7.73–9.33]) and the extended LightGBM model (RMSE: 8.26 ml/kg/min [95% CI: 7.44–9.09]) significantly reduced the error by 15% and 12% (P < .001 for both), respectively.

Discussion: The integration of ML and national data source presents a novel approach for estimating cardiovascular fitness. This method provides valuable insights for cardiovascular disease risk classification and clinical decision-making, ultimately leading to improved health outcomes.

Conclusion: Our nonexercise models provide improved accuracy in estimating VO₂ max within NHANES data as compared to existing nonexercise algorithms.
[ABSTRACT FROM AUTHOR]
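
The abstract reports model error as RMSE with a 95% CI and as a percentage reduction relative to existing algorithms. The following is a minimal sketch of how such quantities can be computed. It is not the authors' code: the percentile bootstrap is an assumed CI method (the abstract does not state how the intervals were obtained), the function names are hypothetical, and the data are synthetic values generated only for illustration.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def bootstrap_rmse_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for RMSE (an assumed CI method)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample participants with replacement and recompute the RMSE.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(rmse([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return low, high


def relative_reduction(rmse_baseline, rmse_model):
    """Fractional reduction in RMSE relative to a baseline algorithm."""
    return (rmse_baseline - rmse_model) / rmse_baseline


# Synthetic, made-up values for illustration only (not NHANES data).
rng = random.Random(42)
measured = [rng.gauss(40.0, 9.0) for _ in range(200)]
baseline_pred = [m + rng.gauss(0.0, 10.0) for m in measured]
model_pred = [m + rng.gauss(0.0, 8.5) for m in measured]

model_rmse = rmse(measured, model_pred)
ci_low, ci_high = bootstrap_rmse_ci(measured, model_pred)
reduction = relative_reduction(rmse(measured, baseline_pred), model_rmse)
print(f"RMSE {model_rmse:.2f} [{ci_low:.2f}, {ci_high:.2f}], reduction {reduction:.1%}")
```

In practice the measured and predicted VO₂ max values would come from a held-out test set, and the baseline predictions from an existing nonexercise equation applied to the same participants.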

Details

Language :
English
ISSN :
1067-5027
Volume :
30
Issue :
5
Database :
Complementary Index
Journal :
Journal of the American Medical Informatics Association
Publication Type :
Academic Journal
Accession number :
163279492
Full Text :
https://doi.org/10.1093/jamia/ocad035