Abstract

Seasonal influenza epidemics cause severe illness and 250,000 to 500,000 deaths worldwide each year. Other outbreaks may turn into devastating pandemics, as the 1918 “Spanish Flu” did. Reducing the impact of these threats is of paramount importance for health authorities, and studies have shown that effective interventions can contain an epidemic if it is detected early. In this paper, we introduce Social Network Enabled Flu Trends (SNEFT), a continuous data collection framework that monitors flu-related tweets and tracks the emergence and spread of influenza. We show that text mining significantly enhances the correlation between Twitter data and the influenza-like illness (ILI) rates provided by the Centers for Disease Control and Prevention (CDC). For accurate prediction, we implemented an auto-regression with exogenous input (ARX) model that uses current Twitter data and CDC ILI rates from previous weeks to predict current influenza statistics. Our results show that, while previous ILI data from the CDC offer a true (but delayed) assessment of a flu epidemic, Twitter data provide a real-time assessment of the current epidemic condition and can be used to compensate for the lack of current ILI data. We observe that Twitter data are highly correlated with ILI rates across different regions of the USA and can be used to effectively improve the accuracy of our prediction. Our age-based flu prediction analysis indicates that, for most regions, Twitter data best fit the 5-24 and 25-49 age groups, which is consistent with these likely being the most active age groups on Twitter. Therefore, Twitter data can act as a supplementary indicator to gauge influenza within a population and can help discover flu trends ahead of the CDC.
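The ARX idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the model orders (m = 2 past ILI weeks, n = 2 Twitter weeks), the intercept term, the ordinary least-squares fit, the function names, and the synthetic data are all assumptions made for illustration. The current ILI rate is regressed on ILI rates from previous weeks plus Twitter counts from the current and preceding week.

```python
import random

def build_arx_rows(ili, tweets, m=2, n=2):
    """Build regression rows [ili[t-1..t-m], tweets[t..t-n+1], 1] with target ili[t]."""
    start = max(m, n - 1)  # first week with enough history for both inputs
    rows, targets = [], []
    for t in range(start, len(ili)):
        past_ili = [ili[t - i] for i in range(1, m + 1)]    # delayed CDC data
        recent_tweets = [tweets[t - j] for j in range(n)]   # includes current week
        rows.append(past_ili + recent_tweets + [1.0])       # 1.0 is the intercept
        targets.append(ili[t])
    return rows, targets

def solve_linear(a, b):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    k = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        s = sum(aug[i][c] * x[c] for c in range(i + 1, k))
        x[i] = (aug[i][k] - s) / aug[i][i]
    return x

def fit_arx(ili, tweets, m=2, n=2):
    """Least-squares ARX coefficients via the normal equations (X^T X) w = X^T y."""
    rows, targets = build_arx_rows(ili, tweets, m, n)
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)

def predict_current(coef, past_ili, recent_tweets):
    """Predict this week's ILI rate from past ILI rates and recent Twitter counts."""
    features = list(past_ili) + list(recent_tweets) + [1.0]
    return sum(c * f for c, f in zip(coef, features))

# Synthetic, noise-free series generated from known coefficients (assumed values,
# not real CDC or Twitter data), used only to check that the fit recovers them.
random.seed(0)
true_coef = [0.5, 0.2, 0.3, 0.1, 0.05]
tweets = [random.random() for _ in range(60)]
ili = [1.0, 1.2]
for t in range(2, 60):
    ili.append(0.5 * ili[t - 1] + 0.2 * ili[t - 2]
               + 0.3 * tweets[t] + 0.1 * tweets[t - 1] + 0.05)

coef = fit_arx(ili, tweets)
estimate = predict_current(coef, [ili[-2], ili[-3]], [tweets[-1], tweets[-2]])
```

In this sketch the Twitter terms stand in for the week that the ILI report has not yet covered, which mirrors the abstract's point that Twitter data compensate for the lack of current ILI data. With real, noisy data the coefficients would be estimates only, and the model orders would need to be chosen by validation.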
