Accuracy of Computable Phenotyping Approaches for SARS-CoV-2 Infection and COVID-19 Hospitalizations from the Electronic Health Record

Authors :
Benjamin D. Pollock
Rohan Khera
William G. Jenkinson
Albert I. Ko
David R. Peaper
Richard A. Martinello
Cynthia Brandt
H. Patrick Young
Frederick Warner
Joseph S. Ross
Bobak J. Mortazavi
Wade L. Schulz
Harlan M. Krumholz
Veer Sangha
Nilay Shah
Zhenqiu Lin
Camille A Knepper
Karen H. Wang
Elitza S. Theel
Source :
medRxiv, article-version (status) pre, article-version (number) 3
Publication Year :
2021
Publisher :
Cold Spring Harbor Laboratory, 2021.

Abstract

Objective: Real-world data have been critical for rapid-knowledge generation throughout the COVID-19 pandemic. To ensure high-quality results are delivered to guide clinical decision making and the public health response, as well as characterize the response to interventions, it is essential to establish the accuracy of COVID-19 case definitions derived from administrative data to identify infections and hospitalizations.

Methods: Electronic Health Record (EHR) data were obtained from the clinical data warehouse of the Yale New Haven Health System (Yale, primary site) and 3 hospital systems of the Mayo Clinic (validation site). Detailed characteristics on demographics, diagnoses, and laboratory results were obtained for all patients with either a positive SARS-CoV-2 PCR or antigen test or ICD-10 diagnosis of COVID-19 (U07.1) between April 1, 2020 and March 1, 2021. Various computable phenotype definitions were evaluated for their accuracy to identify SARS-CoV-2 infection and COVID-19 hospitalizations.

Results: Of the 69,423 individuals with either a diagnosis code or a laboratory diagnosis of a SARS-CoV-2 infection at Yale, 61,023 had a principal or a secondary diagnosis code for COVID-19 and 50,355 had a positive SARS-CoV-2 test. Among those with a positive laboratory test, 38,506 (76.5%) and 3,449 (6.8%) had a principal and secondary diagnosis code of COVID-19, respectively, while 8,400 (16.7%) had no COVID-19 diagnosis. Moreover, of the 61,023 patients with a COVID-19 diagnosis code, 19,068 (31.2%) did not have a positive laboratory test for SARS-CoV-2 in the EHR. Of the 20 cases randomly sampled from this latter group for manual review, all had a COVID-19 diagnosis code related to asymptomatic testing with negative subsequent test results. The positive predictive value (precision) and sensitivity (recall) of a COVID-19 diagnosis in the medical record for a documented positive SARS-CoV-2 test were 68.8% and 83.3%, respectively.
Among 5,109 patients who were hospitalized with a principal diagnosis of COVID-19, 4,843 (94.8%) had a positive SARS-CoV-2 test within the 2 weeks preceding hospital admission or during hospitalization. In addition, 789 hospitalizations had a secondary diagnosis of COVID-19, of which 446 (56.5%) had a principal diagnosis consistent with severe clinical manifestation of COVID-19 (e.g., sepsis or respiratory failure). Compared with the cohort that had a principal diagnosis of COVID-19, those with a secondary diagnosis had a more than 2-fold higher in-hospital mortality rate (28.0% vs 13.2%) [P value truncated in source].

Conclusions: COVID-19 diagnosis codes misclassified the SARS-CoV-2 infection status of many people, with implications for clinical research and epidemiological surveillance. Moreover, the codes performed differently across two academic health systems and identified groups with different risks of mortality. Real-world data from the EHR can be used in conjunction with diagnosis codes to improve the identification of people infected with SARS-CoV-2.
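
The positive predictive value and sensitivity reported in the Results can be checked against the counts given in the abstract. The sketch below is only an arithmetic illustration, not the authors' analysis code; the variable names are hypothetical, and it assumes that a documented positive SARS-CoV-2 test is the reference standard and that any principal or secondary COVID-19 diagnosis code counts as a positive classification.

```python
# Counts for the Yale cohort, copied from the abstract.
dx_code_total = 61_023        # patients with a principal or secondary COVID-19 code
test_positive_total = 50_355  # patients with a positive SARS-CoV-2 test
code_and_test = 38_506 + 3_449  # test-positive patients who also carry a code

# Assumption: a positive test is the reference standard.
ppv = code_and_test / dx_code_total                # precision of the diagnosis code
sensitivity = code_and_test / test_positive_total  # recall of the diagnosis code

# Discordant groups implied by the same counts.
code_without_test = dx_code_total - code_and_test
test_without_code = test_positive_total - code_and_test
cohort_total = dx_code_total + test_without_code

print(f"PPV: {ppv:.1%}, sensitivity: {sensitivity:.1%}")
print(f"Code without positive test: {code_without_test:,}")
print(f"Positive test without code: {test_without_code:,}")
print(f"Cohort total: {cohort_total:,}")
```

The derived discordant counts and cohort total match the figures stated in the abstract (19,068, 8,400, and 69,423), so the reported percentages are internally consistent.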

Details

Database :
OpenAIRE
Journal :
medRxiv, article-version (status) pre, article-version (number) 3
Accession number :
edsair.doi.dedup.....8f974fe836c9c050fa95b31c03898518