A machine learning approach to identifying patients with pulmonary hypertension using real-world electronic health records

Authors :
Emily Kogan
Eva-Maria Didden
Eileen Lee
Anderson Nnewihe
Dimitri Stamatiadis
Samson Mataraso
Deborah Quinn
Daniel Rosenberg
Christel Chehoud
Charles Bridges
Source :
International Journal of Cardiology. 374:95-99
Publication Year :
2023
Publisher :
Elsevier BV, 2023.

Abstract

This study aimed to develop a machine learning (ML) model to identify patients who are likely to have pulmonary hypertension (PH), using a large patient-level US-based electronic health record (EHR) database. A gradient boosting model, XGBoost, was developed using data from Optum's US-based de-identified EHR dataset (2007-2019). PH and disease control adult patients were identified using diagnostic, treatment and procedure codes and were randomly split into the training (90%) or test set (10%). Model features included patient demographics, physician visits, diagnoses, procedures, prescriptions, and laboratory test results. Shapley Additive exPlanations values were used to determine feature importance. We identified 11,279,478 control and 115,822 PH patients (mean age, respectively: 62 and 68 years, both 53% female). The final model used 165 features, with the most important predictive features including diagnosis of heart failure, shortness of breath and atrial fibrillation. The model predicted PH with an area under the receiver operating characteristic curve (AUROC) of 0.92. AUROC remained above 0.80 for the prediction of PH up to and beyond 18 months before diagnosis. Among the PH patients, we also identified 955 pulmonary arterial hypertension (PAH) and 1432 chronic thromboembolic pulmonary hypertension (CTEPH) patients, and the range of AUROCs obtained for these cohorts was 0.79-0.90 and 0.87-0.96, respectively. This model to detect PH based on patients' EHR records is viable and performs well in subgroups of PAH and CTEPH patients. This approach has the potential to improve patient outcomes by reducing diagnostic delay in PH.
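
For readers unfamiliar with the evaluation described in the abstract, the following is a minimal sketch of a random 90%/10% split and of how an AUROC can be computed from predicted scores. It is not the authors' code: the function names `split_train_test` and `auroc`, the seed, and the toy labels and scores are all illustrative assumptions, and the XGBoost model and Shapley value steps are not reproduced here.

```python
import random


def split_train_test(items, test_fraction=0.10, seed=0):
    """Randomly split items into (train, test) lists.

    Hypothetical helper: the 10% test fraction mirrors the abstract,
    the seed is an arbitrary choice for reproducibility.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation.

    labels: 1 for a case (e.g. PH), 0 for a control.
    scores: model outputs, higher meaning more likely to be a case.
    Tied scores receive their average rank.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one case and one control")

    # Assign 1-based ranks in ascending score order, averaging over ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U statistic for the positive class, normalised by the number of pairs.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


if __name__ == "__main__":
    train, test = split_train_test(range(100))
    print(len(train), len(test))  # 90 10
    # Toy data only: two controls and two cases with made-up scores.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Read this way, an AUROC of 0.92 means that a randomly chosen PH patient receives a higher score than a randomly chosen control in about 92% of such pairs.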

Details

ISSN :
01675273
Volume :
374
Database :
OpenAIRE
Journal :
International Journal of Cardiology
Accession number :
edsair.doi.dedup.....084f89749c51386f16426d1ac8450e65
Full Text :
https://doi.org/10.1016/j.ijcard.2022.12.016