Generative Bayesian modeling to nowcast the effective reproduction number from line list data with missing symptom onset dates.

Authors :
Lison, Adrian
Abbott, Sam
Huisman, Jana
Stadler, Tanja
Source :
PLoS Computational Biology; 4/16/2024, Vol. 20 Issue 4, p1-32, 32p
Publication Year :
2024

Abstract

The time-varying effective reproduction number R_t is a widely used indicator of transmission dynamics during infectious disease outbreaks. Timely estimates of R_t can be obtained from reported cases counted by their date of symptom onset, which is generally closer to the time of infection than the date of report. Case counts by date of symptom onset are typically obtained from line list data; however, these data can have missing information and are subject to right truncation. Previous methods have addressed these problems independently by first imputing missing onset dates, then adjusting truncated case counts, and finally estimating the effective reproduction number. This stepwise approach makes it difficult to propagate uncertainty and can introduce subtle biases during real-time estimation due to the continued impact of assumptions made in previous steps. In this work, we integrate imputation, truncation adjustment, and R_t estimation into a single generative Bayesian model, allowing direct joint inference of case counts and R_t from line list data with missing symptom onset dates. We then use this framework to compare the performance of nowcasting approaches with different stepwise and generative components on synthetic line list data for multiple outbreak scenarios and across different epidemic phases. We find that under reporting delays realistic for hospitalization data (50% of reports delayed by more than a week), intermediate smoothing, as is common practice in stepwise approaches, can bias nowcasts of case counts and R_t. A joint generative approach avoids this bias through shared regularization of all model components. On incomplete line list data, a fully generative approach enables the quantification of uncertainty due to missing onset dates without the need for an initial multiple imputation step. In a real-world comparison using hospitalization line list data from the COVID-19 pandemic in Switzerland, we observe the same qualitative differences between approaches. The generative modeling components developed in this work have been integrated and further extended in the R package epinowcast, providing a flexible and interpretable tool for real-time surveillance.

Author summary: During an infectious disease outbreak, public health authorities require timely indicators of transmission dynamics, such as the effective reproduction number R_t. Since reporting data are delayed and often incomplete, statistical methods must be employed to obtain real-time estimates of case numbers and R_t. Existing methods involve separate steps for imputing missing data, adjusting for reporting delays, and estimating R_t. This stepwise approach impedes uncertainty quantification and can lead to inconsistent smoothing assumptions across steps. In this paper, we propose an alternative approach based on generative Bayesian modeling which integrates all steps into a single nowcasting model that can be directly fit to observed data. Using synthetic and real-world line list data, we demonstrate that the generative approach better captures uncertainty and avoids bias from inconsistent assumptions. The model components of our approach have been integrated into the R package epinowcast for easy use in practice. [ABSTRACT FROM AUTHOR]
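
The right truncation described in the abstract can be illustrated with a small sketch. It is not the authors' generative model and does not use the epinowcast API. It only shows the simple multiplicative adjustment that stepwise pipelines often build on, under the assumption (ours, for illustration) that reporting delays are exponential with a median of 7 days, so that half of all reports arrive more than a week after onset. The function names and all case numbers are hypothetical.

```python
import numpy as np

def reported_fraction(days_since_onset, median_delay=7.0):
    """Fraction of cases already reported `days_since_onset` days after onset,
    under an assumed exponential reporting delay with the given median."""
    rate = np.log(2.0) / median_delay
    return 1.0 - np.exp(-rate * np.asarray(days_since_onset, dtype=float))

def naive_nowcast(observed_by_onset, days_since_onset, median_delay=7.0):
    """Inflate right-truncated counts by the inverse of the reported fraction.
    Only defined for days_since_onset > 0 (the fraction is zero at day 0)."""
    frac = reported_fraction(days_since_onset, median_delay)
    return np.asarray(observed_by_onset, dtype=float) / frac

# Illustrative numbers only: 100 true cases on each onset date.
days_since_onset = np.array([28, 14, 7, 3, 1])
true_cases = np.full(5, 100.0)

# Expected counts visible today: recent onset dates look artificially low.
observed = true_cases * reported_fraction(days_since_onset)

# The adjustment recovers the true counts here only because the delay
# distribution is known exactly and the counts carry no noise.
nowcast = naive_nowcast(observed, days_since_onset)
```

With real data the delay distribution must itself be estimated and counts are noisy, so the inflated values for the most recent dates become highly uncertain. That uncertainty is what the abstract argues should be propagated jointly into the R_t estimate rather than passed along as point estimates between separate steps.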

Details

Language :
English
ISSN :
1553734X
Volume :
20
Issue :
4
Database :
Complementary Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
176632555
Full Text :
https://doi.org/10.1371/journal.pcbi.1012021