101. Inferring global-scale temporal latent topics from news reports to predict public health interventions for COVID-19
- Author
- Zhi Wen, David L. Buckeridge, Guido Powell, Yue Li, and Imane Chafi
- Subjects
Topic model, 2019-20 coronavirus outbreak, Government, Coronavirus disease 2019 (COVID-19), Computer science, Public health interventions, Psychological intervention, COVID-19, General Decision Sciences, transfer learning, Data science, Article, public health surveillance, latent topic models, non-pharmacological interventions, Scale (social sciences), variational autoencoder, Classifier (UML)
- Abstract
The COVID-19 pandemic has highlighted the importance of non-pharmacological interventions (NPIs) for controlling epidemics of emerging infectious diseases. Despite their importance, NPIs have been monitored mainly through the manual efforts of volunteers. This approach hinders measurement of the NPI effectiveness and development of evidence to guide their use to control the global pandemic. We present EpiTopics, a machine learning approach to support automation of NPI prediction and monitoring at both the document level and country level by mining the vast amount of unlabeled news reports on COVID-19. EpiTopics uses a 3-stage, transfer-learning algorithm to classify documents according to NPI categories, relying on topic modeling to support result interpretation. We identified 25 interpretable topics under 4 distinct and coherent COVID-related themes. Importantly, the use of these topics resulted in significant improvements over alternative automated methods in predicting the NPIs in labeled documents and in predicting country-level NPIs for 42 countries.
The bigger picture
Accurate, scalable detection of the timing of changes to public health interventions for COVID-19 is an important step toward automating evaluation of the effectiveness of interventions. We show that it is possible to train an interpretable deep-learning model called EpiTopics on media news data to predict (1) the interventions mentioned in individual news articles and (2) the temporal change of intervention status at the country level. We addressed a main challenge of label scarcity among the news reports. Using EpiTopics, we modeled the latent semantics from 1.2 million unlabeled news reports on COVID-19 over 42 countries recorded from November 1, 2019 to July 31, 2020, identifying 25 interpretable topics under 4 COVID-related themes. Using the learned topic model, we inferred topic mixture membership for each labeled article, which allowed us to learn an accurate connection between the topics and the public health interventions at both the document level and country level.
We developed a machine learning model called EpiTopics toward automatic detection of the changes in the status of non-pharmacological interventions (NPI) for COVID-19 from news reports. EpiTopics learns country-dependent topics from large numbers of COVID-19 news reports that do not have NPI labels. Subsequently, EpiTopics learns accurate connections between these topics and changes in NPI status from a set of labeled news reports, which enables accurate detection of temporal NPI status for each country referred to in the news reports.
- Published
- 2021