1. Electronic Health Records-based identification of newly diagnosed Crohn's Disease cases.
- Author
- Ibing S, Hugo J, Borchert F, Schmidt L, Benson C, Marshall AA, Chasteau C, Korie U, Paguay D, Sachs JP, Renard BY, Cho JH, Böttinger EP, and Ungaro RC
- Subjects
- Humans, Female, Male, Adult, Case-Control Studies, Middle Aged, Risk Factors, Young Adult, Crohn Disease diagnosis, Electronic Health Records, Algorithms
- Abstract
Background: Early diagnosis and treatment of Crohn's Disease are associated with decreased risk of surgery and complications. However, diagnostic delay is frequently seen in clinical practice. To better understand Crohn's Disease risk factors and disease indicators, we identified, described, and predicted incident Crohn's Disease patients based on the Electronic Health Record data of the Mount Sinai Health System.
Methods: We developed two phenotyping algorithms based on structured Electronic Health Record data (i.e., coded diagnosis, medication prescription, and healthcare utilization), and a simpler and a more advanced approach to information extraction from clinical notes, using data from 2011 to 2023. We conducted an ablation study for the classification task using different models, prediction time points, data inputs, text encoding methods, and case-control matching variables.
Results: We identified 247 incident Crohn's Disease cases and 1221 matched controls and validated our cohorts through manual chart review. A second control cohort (n = 1235) was created without matching on race. Gastrointestinal symptoms were significantly overrepresented in cases at least 180 days before the first coded Crohn's Disease diagnosis. Adding text-based features to the clinical prediction models improved their overall performance. However, adding race as a matching variable had a greater effect on model performance than the choice of modeling algorithm or input data, with a difference of 0.09 in area under the receiver operating characteristic curve between the best-performing models.
Conclusion: We demonstrate the feasibility of identifying newly diagnosed Crohn's Disease patients within a United States health system using Electronic Health Records. For the predictive modeling task, cases and controls could be distinguished with only modest performance, even though various state-of-the-art methods were applied based on features from structured and unstructured data. Our findings suggest the benefit of adding information from clinical notes in a supervised or unsupervised manner for cohort creation and predictive modeling.
Competing Interests: RCU has served as a consultant and/or advisory board member for AbbVie, Bristol Myers Squibb, Celltrion, Inotrem, Lilly, Janssen, Pfizer, Roivant, and Takeda. BYR has served as a consultant and/or holds intellectual property rights commercialized by Seqstant, Biontech, and Genentech (Roche). The remaining authors declare no conflict of interest.
(Copyright © 2024 Elsevier B.V. All rights reserved.)
- Published
- 2025