Cumulative Query Method for Influenza Surveillance Using Search Engine Data
- Source :
- Journal of Medical Internet Research, Vol 16, Iss 12, p e289 (2014)
- Publication Year :
- 2014
- Publisher :
- JMIR Publications Inc., 2014.
Abstract
- Background: Internet search queries have become an important data source in syndromic surveillance systems. However, there is currently no syndromic surveillance system using Internet search query data in South Korea.
Objectives: The objective of this study was to examine correlations between our cumulative query method and national influenza surveillance data.
Methods: Our study was based on the local search engine Daum (approximately 25% market share) and influenza-like illness (ILI) data from the Korea Centers for Disease Control and Prevention. A quota sampling survey was conducted with 200 participants to obtain popular queries. We divided the study period into two sets: Set 1 (the 2009/10 epidemiological year for development set 1 and 2010/11 for validation set 1) and Set 2 (2010/11 for development set 2 and 2011/12 for validation set 2). Pearson’s correlation coefficients were calculated between the Daum data and the ILI data for the development set. We selected the combined queries for which the correlation coefficients were .7 or higher and listed them in descending order. Then, we created cumulative query method n, where n represents the number of combined queries accumulated in descending order of the correlation coefficient.
Results: In validation set 1, 13 cumulative query methods were applied, and 8 had higher correlation coefficients (min=.916, max=.943) than that of the highest single combined query. Further, 11 of 13 cumulative query methods had an r value of ≥.7, whereas only 4 of 13 combined queries did. In validation set 2, 8 of 15 cumulative query methods showed higher correlation coefficients (min=.975, max=.987) than that of the highest single combined query. All 15 cumulative query methods had an r value of ≥.7, whereas only 6 of 15 combined queries did.
Conclusions: The cumulative query method showed relatively higher correlations with national influenza surveillance data than the combined queries in the development and validation sets. [J Med Internet Res 2014;16(12):e289]
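The selection procedure described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that "cumulative query method n" means summing the weekly search volumes of the n highest-ranked combined queries, and every function name, query name, and number in it is hypothetical toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_queries(dev_queries, dev_ili, threshold=0.7):
    """Keep queries with r >= threshold on the development set,
    ordered by descending correlation with the ILI series."""
    scored = [(pearson(series, dev_ili), name)
              for name, series in dev_queries.items()]
    return [name for r, name in sorted(scored, reverse=True) if r >= threshold]


def cumulative_series(queries, ranked, n):
    """Assumed reading of 'cumulative query method n':
    the week-by-week sum of the top n ranked queries."""
    weeks = len(queries[ranked[0]])
    return [sum(queries[name][t] for name in ranked[:n]) for t in range(weeks)]


# Hypothetical toy data: five weeks of ILI rates and search volumes.
dev_ili = [1, 2, 3, 4, 5]
dev_queries = {
    "flu": [2, 4, 6, 8, 10],
    "fever": [1, 2, 2, 4, 5],
    "noise": [5, 1, 4, 2, 3],
}

ranked = rank_queries(dev_queries, dev_ili)
method_2 = cumulative_series(dev_queries, ranked, 2)
```

In a validation step, the same ranking would be reused on the following epidemiological year: each cumulative series built from the validation-year search volumes is correlated with that year's ILI data and compared against the single best combined query.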
- Subjects :
- Adult
Male
query
Correlation coefficient
Computer science
influenza-like illness
Health Informatics
Google Flu Trends
lcsh:Computer applications to medicine. Medical informatics
syndromic surveillance system
Correlation
Set (abstract data type)
Search engine
Influenza, Human
Humans
Local search (constraint satisfaction)
Original Paper
Internet
Web search query
Data collection
lcsh:Public aspects of medicine
lcsh:RA1-1270
Middle Aged
Internet search
United States
Population Surveillance
Quota sampling
lcsh:R858-859.7
Female
Data mining
Centers for Disease Control and Prevention, U.S
influenza
Details
- ISSN :
- 14388871
- Volume :
- 16
- Database :
- OpenAIRE
- Journal :
- Journal of Medical Internet Research
- Accession number :
- edsair.doi.dedup.....d3c0fd21f9331d1c91555ffd5be6172c