Abstract

Background: As suggested as early as 2006, logs of queries submitted to search engines by people seeking information could be a source for detecting emerging influenza epidemics if changes in the volume of search queries are monitored (infodemiology). However, selecting the queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions.

Objective: In this study, we describe a methodological extension for detecting influenza outbreaks using search query data; we provide a new approach for query selection through the exploration of contextual information gleaned from social media data. Additionally, we evaluate whether it is possible to use these queries for monitoring and predicting influenza epidemics in South Korea.

Methods: Our study was based on freely available weekly influenza incidence data and query data originating from the search engine on the Korean website Daum between April 3, 2011 and April 5, 2014. To select queries related to influenza epidemics, several approaches were applied: (1) exploring influenza-related words in social media data, (2) identifying the chief concerns related to influenza, and (3) using Web query recommendations. Optimal feature selection by least absolute shrinkage and selection operator (Lasso) and support vector machine for regression (SVR) were used to construct a model predicting influenza epidemics.

Results: In total, 146 queries related to influenza were generated through our initial query selection approach. A considerable proportion of optimal features for final models were derived from queries with reference to the social media data. The SVR model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001) and virological incidence rate (r=.963; P<.001).

Conclusions: These results demonstrate the feasibility of using search queries to enhance influenza surveillance in South Korea. In addition, an approach for query selection using social media data seems ideal for supporting influenza surveillance based on search query data.
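
The Lasso feature selection step named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain cyclic coordinate descent on synthetic data, and the function names, the penalty value `alpha=0.2`, and the six made-up "queries" are all assumptions chosen for illustration. Queries whose weight is shrunk exactly to zero are dropped; the surviving ones would then be passed to a regression model such as SVR, which is not shown here.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    X is a list of rows (one per week) of query volumes; y is the centred target.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # residual = y - Xw, and w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic data (hypothetical): 120 weeks, 6 candidate queries,
# of which only queries 0 and 3 actually drive the target.
random.seed(0)
n_weeks, n_queries = 120, 6
X = [[random.gauss(0, 1) for _ in range(n_queries)] for _ in range(n_weeks)]
y = [2.0 * row[0] - 1.5 * row[3] + random.gauss(0, 0.1) for row in X]
y_mean = sum(y) / len(y)
y = [v - y_mean for v in y]

weights = lasso_coordinate_descent(X, y, alpha=0.2)
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected query indices:", selected)
```

A larger `alpha` discards more queries, so in practice the penalty would be tuned, for example by cross-validation, rather than fixed by hand as it is here.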

Highlights

  • An early and well-known example of utilizing Internet data for health-related applications came from the estimation of influenza incidence using anonymous logs of Web search engine queries

  • The support vector machine for regression (SVR) model performed well: the prediction values were highly correlated with the recent observed influenza-like illness (r=.956; P<.001)

  • A total of 146 queries related to influenza were generated through our initial query selection approach


Summary

Introduction

An early and well-known example of utilizing Internet data for health-related applications came from the estimation of influenza incidence using anonymous logs of Web search engine queries. […] [4], Baidu [5], or other medical websites [6] and traditional data used for influenza surveillance, such as influenza-like illness (ILI) and/or laboratory-confirmed data. These studies indicate that individuals faced with disease or ill health will search the Internet for information about their state of health and possible countermeasures to illness; logs of queries submitted to search engines by individuals seeking this information are potential sources of information for detecting emerging epidemics, as it is possible to track changes in the volumes of specific search queries. Social media data have been highlighted as an additional potential data source for disease surveillance because they contain a greater variety of contextual health information with diverse descriptions of health states. Such data could be a useful reference point for researchers who wish to select initial target queries in query-based prediction. Selecting queries that are most likely to be associated with influenza epidemics is a particular challenge when it comes to generating better predictions.
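
One simple way to judge whether a candidate query tracks influenza activity is to correlate its weekly volume with the weekly ILI rate. The sketch below is only an illustration of that idea, not the selection procedure used in this study; the query strings, the numbers, and the helper name `pearson_r` are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no information about the other
    return cov / sqrt(var_x * var_y)


# Made-up weekly ILI rates and query volumes for eight consecutive weeks.
ili = [2.0, 3.0, 5.0, 9.0, 14.0, 11.0, 6.0, 3.0]
query_volumes = {
    "flu symptoms": [20, 31, 48, 92, 139, 112, 58, 33],
    "fever remedy": [15, 18, 30, 41, 70, 66, 35, 20],
    "weather forecast": [50, 47, 52, 49, 51, 50, 48, 53],
}

# Rank candidate queries from most to least correlated with ILI.
ranked = sorted(
    query_volumes,
    key=lambda q: pearson_r(query_volumes[q], ili),
    reverse=True,
)
for query in ranked:
    print(f"{query}: r = {pearson_r(query_volumes[query], ili):.3f}")
```

A ranking like this only screens individual queries; it does not account for redundancy between queries, which is one reason a joint selection method such as Lasso is used in the modeling step.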
