Abstract

Traditional methods for monitoring influenza are haphazard and lack fine-grained details regarding the spatial and temporal dynamics of outbreaks. Twitter gives researchers and public health officials an opportunity to examine the spread of influenza in real time and at multiple geographical scales. In this paper, we introduce an improved framework for monitoring influenza outbreaks using Twitter. Relying upon techniques from geographic information science (GIS) and data mining, we collected, filtered, and analyzed Twitter messages for the thirty most populous cities in the United States during the 2013–2014 flu season. We compared the results of this procedure with national, regional, and local flu outbreak reports and found a statistically significant correlation between the two data sources. The main contribution of this paper is a comprehensive data mining process that improves on previous attempts to accurately identify tweets related to influenza. Additionally, geographic information systems allow us to target, filter, and normalize Twitter messages.

Highlights

  • Public health scholars have long studied the ways in which disease outbreaks can be monitored

  • Though remarkable progress has been made in the past few years, there remain significant obstacles facing researchers wanting to use these new sources of data to inform public health decisions

  • Lamb et al. [9] used manually defined word feature classes to filter out tweets that did not appear to reflect personal infection. Though these efforts are valuable contributions to the fields of public health and Big Data analysis, this paper suggests that a greater reliance on geographic information system (GIS) and machine learning methods can shed new light on the role that exciting new data sources such as Twitter can play in studying disease outbreaks.



Introduction

Public health scholars have long studied the ways in which disease outbreaks can be monitored. This study builds on the flurry of recent research activity seeking to understand the dynamics of disease outbreaks by analyzing Internet and social media data. Many researchers have turned to Google Flu Trends (GFT) for user-generated data on influenza. This site publishes information on flu-related web searches that can be used to monitor changes in Internet activity over time and space. Dugas et al. [2] relied upon historical outbreak data along with patterns in Flu Trends to create a forecast model that allows the team to predict flu cases one week in advance with reasonable precision. However, Declan Butler has commented in Nature [4] that Google’s significant miscalculation in predicting the intensity of flu outbreaks for the 2013 season casts serious doubt on the reliability of Flu Trends data for monitoring illness.

