11 Novel Systematic Method for Identifying Congenital Anomaly Cases in Electronic Health Record Databases

Authors :
Elly Brokamp
Lisa Bastarache
Nancy Cox
Rizwan Hamid
Nikhil K. Khanakari
Gillian Hooker
Megan Shuey
Source :
Journal of Clinical and Translational Science, Vol 8, Pp 3-3 (2024)
Publication Year :
2024
Publisher :
Cambridge University Press, 2024.

Abstract

OBJECTIVES/GOALS: Congenital anomalies (CAs) affect 3% of live births, yet the cause of 80% of CAs is unknown, and for the 20% with an identified cause, variability in penetrance suggests that additional risk drivers exist. Our method for identifying and categorizing CAs in electronic health record (EHR)-linked biobank databases can expand and improve CA etiologic research.

METHODS/STUDY POPULATION: We identified individuals with CAs in three groups: (1) those with at least one CA; (2) those with multiple CAs (MCA), defined as two or more ‘major’ CAs; and (3) those with CAs in a specific organ system. We also created a novel quantitative approach, using phenome-wide association studies (pheWAS), for determining CA-associated genetic disease billing codes in order to separate individuals who have a known genetic cause for their CAs from those with idiopathic CAs. We updated the CA phecodes (aggregates of clinical billing codes) and used them to identify CA cases in BioVU, Vanderbilt’s EHR-linked biobank database. We created a new phecode, ‘All CAs’, that lets researchers quickly identify all individuals with at least one CA. We evaluated the definition of MCA using pheWAS analyses to compare ‘minor’ vs ‘major’ CAs.

RESULTS/ANTICIPATED RESULTS: The new CA phecode nomenclature includes 5.8 times more codes for CAs compared with the previous version (365 vs 56), improving granularity. Literature review identified 85 (19.7%) CA-associated genetic disease billing codes. PheWAS analyses revealed an additional 16 (3.7%) genetic disease billing codes with one or more significant (p < 2.75 × 10⁻⁵) associations with CA-related phecodes. Identifying CA-associated genetic disease billing codes allows researchers to differentiate between idiopathic CAs and those that have a known genetic cause. PheWAS analyses of individuals with CAs previously considered ‘minor’ showed many associated severe health problems, revealing that the distinction between ‘minor’ and ‘major’ CAs is arbitrary when identifying individuals with MCA in the EHR.

DISCUSSION/SIGNIFICANCE: Our CA identification method is scalable for the growing number of EHR-linked biobanks. Differentiating idiopathic CAs from those with known causes will increase power in studies seeking additional genetic drivers of CAs. Our novel method allows for expansion and acceleration of CA epidemiological research in EHR-linked biobank data.
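
The case groupings described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the code identifiers, the organ-system mapping, and the function name `classify_person` are hypothetical placeholders, and MCA is approximated here as two or more distinct CA codes without a 'minor'/'major' split.

```python
# Hypothetical lookup tables for illustration only; these are not real
# phecodes or billing codes from the study.
CA_CODE_TO_SYSTEM = {
    "CA_HEART_1": "cardiovascular",
    "CA_HEART_2": "cardiovascular",
    "CA_RENAL_1": "genitourinary",
    "CA_LIMB_1": "musculoskeletal",
}
GENETIC_DISEASE_CODES = {"GEN_1"}  # stand-in for CA-associated genetic disease codes


def classify_person(codes):
    """Assign one person's set of codes to the CA groups named in the abstract."""
    ca_codes = {c for c in codes if c in CA_CODE_TO_SYSTEM}
    has_genetic_code = any(c in GENETIC_DISEASE_CODES for c in codes)
    return {
        # Group 1: at least one CA (the 'All CAs' idea).
        "any_ca": len(ca_codes) >= 1,
        # Group 2: multiple CAs; assumed here to mean >= 2 distinct CA codes.
        "mca": len(ca_codes) >= 2,
        # Group 3: organ systems in which the person has a CA.
        "systems": {CA_CODE_TO_SYSTEM[c] for c in ca_codes},
        # Idiopathic: has a CA and no CA-associated genetic disease code.
        "idiopathic": len(ca_codes) >= 1 and not has_genetic_code,
    }


if __name__ == "__main__":
    print(classify_person({"CA_HEART_1", "CA_RENAL_1"}))
    print(classify_person({"CA_LIMB_1", "GEN_1"}))
```

In a real EHR-linked biobank the lookup tables would come from the published phecode map and the curated list of genetic disease billing codes rather than being hard-coded.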

Subjects

Subjects :
Medicine

Details

Language :
English
ISSN :
20598661
Volume :
8
Database :
Directory of Open Access Journals
Journal :
Journal of Clinical and Translational Science
Publication Type :
Academic Journal
Accession number :
edsdoj.5fb762e36fbd42b8ae887315d9edea30
Document Type :
article
Full Text :
https://doi.org/10.1017/cts.2024.32