The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction

Authors :
John Muschelli
Mark M. Bissell
Melissa A. Haendel
Seth Russell
Adit Anand
Anita Walden
Richard A. Moffitt
Michele I. Morris
Eli Levitt
Davera Gabriel
Brian T. Garibaldi
Tellen D. Bennett
Benjamin Amor
Heidi Spratt
Kenneth R. Gersing
Amin Manna
Janos Hajagos
Matvey B. Palchuk
Richard L Zhu
Nabeel Qureshi
Julie A. McMurry
Carolyn Bremer
Alina Denham
Elaine L. Hill
Kristin Kostka
Andrew T Girvin
Katie Rebecca Bradwell
Justin Guinney
Jacob T. Wooldridge
Andrew E. Williams
Emily R. Pfaff
Yun Jae Yoo
Stephanie S Hong
James Brian Byrd
Xiaohan Tanner Zhang
Joel H. Saltz
Peter E. DeWitt
Zhenglong Qian
Christopher P. Austin
Andrew J. Neumann
Christopher G. Chute
Hunter Jimenez
Ramakanth Kavuluru
Harold P Lehmann
Sandeep K. Mallipattu
Source :
medRxiv
Publication Year :
2021
Publisher :
Cold Spring Harbor Laboratory, 2021.

Abstract

Background: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy.

Methods and Findings: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen

Conclusions: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for ML models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.

Details

Database :
OpenAIRE
Journal :
medRxiv
Accession number :
edsair.doi.dedup.....117f939a651374d146f14ec0940d1530