Detecting influenza epidemics using search engine query data

@article{Ginsberg2009DetectingIE,
  title={Detecting influenza epidemics using search engine query data},
  author={Jeremy Ginsberg and Matthew Mohebbi and Rajan Patel and Lynnette Brammer and Mark S. Smolinski and Larry Brilliant},
  journal={Nature},
  year={2009},
  volume={457},
  pages={1012--1014}
}
Seasonal influenza epidemics are a major public health concern, causing tens of millions of respiratory illnesses and 250,000 to 500,000 deaths worldwide each year. In addition to seasonal influenza, a new strain of influenza virus against which no previous immunity exists and that demonstrates human-to-human transmission could result in a pandemic with millions of fatalities. Early detection of disease activity, when followed by a rapid response, can reduce the impact of both seasonal and… 
Monitoring Seasonal Influenza Epidemics in Korea through Query Search
TLDR
This research aims to develop regression models for early detection of seasonal influenza outbreaks in Korea, using keyword query information from the trend services of Naver (a representative Korean portal site) for PC and mobile devices.
Use of daily Internet search query data improves real-time projections of influenza epidemics
TLDR
This study combines a previously developed calibration and prediction framework with an established humidity-based transmission dynamic model to forecast influenza and finds that both the earlier availability and the finer temporal resolution of daily search query data are important for increasing forecasting performance.
Predicting the Spread of Pandemic Influenza based on Air Traffic Data and Social Media
TLDR
A mathematical model for the spread of influenza by air travel is created and implemented as a public API; it correlates to a reasonable degree with Rvachev and Longini's original results and has been tested on more recent data.
Twitter Improves Seasonal Influenza Prediction
TLDR
The Social Network Enabled Flu Trends (SNEFT) framework, a continuous data collection framework that monitors flu-related tweets and tracks the emergence and spread of influenza, is introduced. The Twitter data is observed to be highly correlated with ILI rates across different regions within the USA and can be used to effectively improve the accuracy of the prediction.
Development of a Real-Time Estimate of Flu Activity in the United States Using Dynamically Updated Lasso Regressions and Google Search Queries
TLDR
Results show that improving the underlying regression model makes substantial and long-term improvements to ILI surveillance using query-based methods, rendering the update to GFT’s database of search queries unnecessary.
Towards detecting influenza epidemics by analyzing Twitter messages
TLDR
This paper analyzes messages posted on the micro-blogging site Twitter.com, proposes several methods to identify influenza-related messages, and compares a number of regression models that correlate these messages with CDC statistics.
Estimating Influenza Outbreaks Using Both Search Engine Query Data and Social Media Data in South Korea
TLDR
A methodological extension for detecting influenza outbreaks using search query data is described and a new approach for query selection through the exploration of contextual information gleaned from social media data is provided, demonstrating the feasibility of using search queries to enhance influenza surveillance in South Korea.
Predicting Flu Trends using Twitter data
TLDR
The Social Network Enabled Flu Trends (SNEFT) framework is presented, which monitors messages posted on Twitter with a mention of flu indicators to track and predict the emergence and spread of an influenza epidemic in a population.
Use of Hangeul Twitter to Track and Predict Human Influenza Infection
TLDR
This study has examined the use of information embedded in the Hangeul Twitter stream to detect rapidly evolving public awareness or concern with respect to influenza transmission and developed regression models that can track levels of actual disease activity and predict influenza epidemics in the real world.
Early and Real-Time Detection of Seasonal Influenza Onset
TLDR
By combining official Influenza-Like Illness incidence rates, searches for ILI-related terms on Google, and an on-call triage phone service, this work was able to identify the beginning of the flu season in 8 European countries, anticipating current official alerts by several weeks.

References

Web Queries as a Source for Syndromic Surveillance
TLDR
It is found that certain web queries on influenza follow the same pattern as that obtained by the two other surveillance systems for influenza epidemics, and that they have equal power for the estimation of the influenza burden in society.
Using internet searches for influenza surveillance.
TLDR
This work counted daily unique queries originating in the United States that contained influenza-related search terms from the Yahoo! search engine from March 2004 through May 2008, and estimated linear models, using searches with 1-10-week lead times as explanatory variables to predict the percentage of cultures positive for influenza and deaths attributable to pneumonia and influenza in the US.
Infodemiology: Tracking Flu-Related Searches on the Web for Syndromic Surveillance
TLDR
Systematically collecting and analyzing health information demand data from the Internet has considerable potential to be used for syndromic surveillance.
Containing Pandemic Influenza at the Source
TLDR
Investigation of the effectiveness of targeted antiviral prophylaxis, quarantine, and pre-vaccination in containing an emerging influenza strain at the source showed that a prepared response with targeted antivirals would have a high probability of containing the disease.
Strategies for containing an emerging influenza pandemic in Southeast Asia
TLDR
A simulation model of influenza transmission in Southeast Asia is used and it is shown that elimination of a nascent pandemic may be feasible using a combination of geographically targeted prophylaxis and social distancing measures, if the basic reproduction number of the new virus is below 1.8.
Analysis of Web Access Logs for Surveillance of Influenza
TLDR
There was a moderately strong correlation between the frequency of influenza-related article accesses and the CDC's traditional surveillance data, but the results on timeliness were inconclusive.
Telephone Triage: A Timely Data Source for Surveillance of Influenza-like Diseases
TLDR
The timeliness of telephone triage (TT) data is compared with influenza surveillance data from the Centers for Disease Control (CDC) using the cross-correlation function, and emergency room TT calls are found to be one to five weeks ahead of the surveillance data collected by the CDC.
Evaluation of Over-the-Counter Pharmaceutical Sales As a Possible Early Warning Indicator of Human Disease
TLDR
Results indicate about a 90% correlation between OTC sales and physician diagnoses of acute respiratory conditions, and the sales in question tend to occur approximately 3 days prior to the physician–patient encounters.
MapReduce: simplified data processing on large clusters
TLDR
This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
The moments of the z and F distributions.