Back to Search Start Over

Pandemic velocity: Forecasting COVID-19 in the US with a machine learning & Bayesian time series compartmental model

Authors :
Phillip Sundin
Marc A. Suchard
Joseph A. Zoller
Gregory L. Watson
Lu Zhang
Teresa Bufford
Di Xiong
John Shamshoian
Anne W. Rimoin
Christina M. Ramirez
Pitzer, Virginia E
Source :
PLoS Computational Biology, PLoS computational biology, vol 17, iss 3, PLoS Computational Biology, Vol 17, Iss 3, p e1008837 (2021)
Publication Year :
2021
Publisher :
Public Library of Science, 2021.

Abstract

Predictions of COVID-19 case growth and mortality are critical to the decisions of political leaders, businesses, and individuals grappling with the pandemic. This predictive task is challenging due to the novelty of the virus, limited data, and dynamic political and societal responses. We embed a Bayesian time series model and a random forest algorithm within an epidemiological compartmental model for empirically grounded COVID-19 predictions. The Bayesian case model fits a location-specific curve to the velocity (first derivative) of the log transformed cumulative case count, borrowing strength across geographic locations and incorporating prior information to obtain a posterior distribution for case trajectories. The compartmental model uses this distribution and predicts deaths using a random forest algorithm trained on COVID-19 data and population-level characteristics, yielding daily projections and interval estimates for cases and deaths in U.S. states. We evaluated the model by training it on progressively longer periods of the pandemic and computing its predictive accuracy over 21-day forecasts. The substantial variation in predicted trajectories and associated uncertainty between states is illustrated by comparing three unique locations: New York, Colorado, and West Virginia. The sophistication and accuracy of this COVID-19 model offer reliable predictions and uncertainty estimates for the current trajectory of the pandemic in the U.S. and provide a platform for future predictions as shifting political and societal responses alter its course.<br />Author summary COVID-19 models can be roughly classified as mathematical models that simulate disease within a population, including epidemiological compartmental models, or statistical curve-fitting models that fit a function to observed data and extrapolate forward into the future. Bridging this divide, we combine the strengths of curve-fitting statistical models and the structure of epidemiological models, by embedding a Bayesian velocity model and a machine learning algorithm (random forest) into the framework of a compartmental model. Fusing these models together exploits the particular strengths of each to glean as much information as possible from the currently available data. We identify the velocity of log cumulative cases as an excellent target for modeling and extrapolating COVID-19 case trajectories. We empirically evaluate the predictive performance of the model and provide predicted trajectories with credible intervals for cumulative confirmed case count, active confirmed infections and COVID-19 deaths for each of the 50 U.S. states. Combining sophisticated data analytic methods with proven epidemiological models offers an empirically grounded strategy for making realistic predictions and quantifying their uncertainty. These predictions indicate substantial variation in the COVID-19 trajectories of U.S. states.

Details

Language :
English
ISSN :
15537358 and 1553734X
Volume :
17
Issue :
3
Database :
OpenAIRE
Journal :
PLoS Computational Biology
Accession number :
edsair.doi.dedup.....61bf722494279ffbde138f19f5f397c7