Open-source machine learning pipeline automatically flags instances of acute respiratory distress syndrome from electronic health records.

Authors :
Morales FL
Xu F
Lee HA
Navarro HT
Bechel MA
Cameron EL
Kelso J
Weiss CH
Nunes Amaral LA
Source :
MedRxiv : the preprint server for health sciences [medRxiv] 2024 May 26. Date of Electronic Publication: 2024 May 26.
Publication Year :
2024

Abstract

Physicians could greatly benefit from automated diagnosis and prognosis tools to help address information overload and decision fatigue. Intensive care physicians in particular stand to benefit from such tools, as they are at especially high risk for both. Acute Respiratory Distress Syndrome (ARDS) is a life-threatening condition that affects >10% of critical care patients and has a mortality rate over 40%. However, recognition rates for ARDS have been shown to be low (30-70%) in clinical settings. In this work, we present a reproducible computational pipeline that automatically adjudicates ARDS on retrospective datasets of mechanically ventilated adult patients. This pipeline automates the steps outlined by the Berlin Definition by implementing natural language processing tools and classification algorithms. We train one XGBoost model on chest imaging reports to detect bilateral infiltrates, and another on a subset of attending physician notes labeled for the most common ARDS risk factor in our data. Both models achieve high performance: a minimum area under the receiver operating characteristic curve (AUROC) of 0.86 for adjudicating chest imaging reports in out-of-bag test sets, and an out-of-bag AUROC of 0.85 for detecting a diagnosis of pneumonia. We validate the entire pipeline on a cohort of MIMIC-III encounters and find a sensitivity of 93.5%, an extraordinary improvement over the 22.6% ARDS recognition rate reported for these encounters, along with a specificity of 73.9%. We conclude that our reproducible, automated diagnostic pipeline exhibits promising accuracy, generalizability, and probability calibration, thus providing a valuable resource for physicians aiming to enhance ARDS diagnosis and treatment strategies. We surmise that proper implementation of the pipeline has the potential to aid clinical practice by facilitating the recognition of ARDS cases at scale.
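
For readers less familiar with the metrics quoted in the abstract, the following is a minimal sketch of how sensitivity, specificity, and AUROC are computed from binary labels and model scores. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = ARDS, 0 = no ARDS)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, counting ties as one half (the rank-based definition)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, NOT from the study): 4 positive and 4 negative encounters.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

# Assumed decision threshold of 0.5 for turning scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
area = auroc(y_true, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUROC={area:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of figures.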

Details

Language :
English
Database :
MEDLINE
Journal :
MedRxiv : the preprint server for health sciences
Accession number :
38826348
Full Text :
https://doi.org/10.1101/2024.05.21.24307715