1. 414. Developing Digital Phenotypes of Primary Immune Deficiencies Using Machine Learning on a Large Electronic Health Record Database.
- Author
- Meister, Leo; Zerbe, Christa; Notarangelo, Luigi D; Kadri, Sameer S; Prevots, D Rebecca; and Ricotta, Emily
- Subjects
- *PRIMARY immunodeficiency diseases, *ELECTRONIC health records, *IMMUNODEFICIENCY, *GENETIC disorders, *MACHINE learning, *EDUCATIONAL technology
- Abstract
Background: More than 350 genetic disorders cause immune deficiencies. Because these conditions are rare, in-depth study of infections associated with primary immune deficiencies (PID) requires extremely large sample sizes drawn from broad populations. Using a large electronic health record (EHR) dataset, we linked clinical and microbiologic data to develop digital phenotypes for PID.

Methods: Using the Cerner HealthFacts EHR dataset from 2009 to 2017, we extracted clinical and microbiologic data for hospitalizations of patients <18 years old with ICD9/10 PID diagnoses and ≥1 positive culture for infection. Machine learning models were used to identify key features that predict PID diagnosis. Features included patient and hospitalization characteristics, infectious agent and infection site, and selected comorbidities. Models were validated using the area under the receiver operating characteristic curve (AUC).

Results: Overall, 1316 patients with a PID were identified (Table 1). The 10 most common pathogens identified for each PID are listed in Table 2. The models classified DiGeorge syndrome (positive predictive value [PPV] 49%), functional disorders of polymorphonuclear neutrophils (PMN) (PPV 43%), and common variable immunodeficiency (CVID) (PPV 47%) better than combined immunodeficiency (CID) (PPV 20%); the overall true positive rate was 47%, with an AUC of 0.73. Predictive features for each PID were as follows: for CVID, having enteritis, hypertension, and pneumonia (Figure 1a); for PMN, having hypoxia and hypertension (Figure 1b); for DiGeorge syndrome, having congenital deformities and not having hypertension (Figure 1c); and for CID, finding Staphylococcus aureus in a wound or Escherichia coli in the blood (Figure 1d).

Conclusion: Early models demonstrate some discrimination, specifically for more common PIDs (CVID) and those with highly identifying factors (DiGeorge syndrome). These models can be improved by including a wider array of clinical data, and they provide a first look at a new methodology to digitally phenotype PIDs for future diagnostic use.

Disclosures: All authors: No reported disclosures. [ABSTRACT FROM AUTHOR]
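
The abstract reports three evaluation metrics: positive predictive value (PPV) per diagnosis, an overall true positive rate, and AUC. As a reading aid, the minimal Python sketch below shows how such metrics are commonly computed. It is not the authors' code: the function names `ppv_and_tpr` and `auc` are hypothetical, the toy labels and scores are invented for illustration, and the abstract does not say which model type or software was used.

```python
def ppv_and_tpr(y_true, y_pred, label):
    """One-vs-rest PPV and true positive rate for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # correctly predicted as label
    fp = sum(t != label and p == label for t, p in pairs)  # wrongly predicted as label
    fn = sum(t == label and p != label for t, p in pairs)  # label missed by the model
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    tpr = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, tpr


def auc(scores, labels):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data, NOT taken from the study.
y_true = ["CVID", "CVID", "CID", "DiGeorge", "CID", "CVID"]
y_pred = ["CVID", "CID", "CVID", "DiGeorge", "CID", "CVID"]
ppv, tpr = ppv_and_tpr(y_true, y_pred, "CVID")

toy_auc = auc(scores=[0.9, 0.8, 0.4, 0.3], labels=[1, 0, 1, 0])
print(round(ppv, 2), round(tpr, 2), toy_auc)
```

Read this way, a PPV of 47% for CVID means that just under half of the hospitalizations the model labeled as CVID carried that diagnosis, while an AUC of 0.73 describes how well the model ranks true cases above non-cases across all thresholds.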
- Published
- 2019