Abstract

Effective and timely disease surveillance systems have the potential to help public health officials design interventions to mitigate the effects of disease outbreaks. Currently, healthcare-based disease monitoring systems in France offer influenza activity information that lags real time by one to three weeks. This temporal data gap introduces uncertainty that prevents public health officials from having a timely perspective on population-level disease activity. Here, we present a machine-learning modeling approach that produces real-time estimates and short-term forecasts of influenza activity for the twelve continental regions of France by leveraging multiple disparate data sources, including Google search activity, real-time and local weather information, flu-related Twitter micro-blogs, electronic health records data, and historical disease activity synchronicities across regions. Our results show that all data sources contribute to improving influenza surveillance and that machine-learning ensembles combining all data sources lead to accurate and timely predictions.

Highlights

  • Influenza is a major public health problem causing up to five million severe cases and 500,000 deaths per year worldwide [1,2,3]

  • We found that all external data sources improve flu estimates, especially electronic health records (EHR) data and Google data

  • The 90% confidence interval (CI) of the best root mean squared error (RMSE) is [42.97; 63.08], with a median value of 56.08. These values are mostly obtained with the ARGONet model, which implies an error reduction of 15% to 41% compared to the baseline.
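As an illustration of the metrics quoted in the highlights, the sketch below computes an RMSE and the percentage error reduction of a model relative to a baseline. It is a minimal example, not the authors' evaluation code: the function names and the three short series are hypothetical and chosen only to show the arithmetic.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(model_rmse, baseline_rmse):
    """Percentage by which the model's RMSE is lower than the baseline's."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Hypothetical weekly incidence values, for illustration only.
observed = [100.0, 150.0, 220.0, 180.0]
baseline = [80.0, 120.0, 260.0, 210.0]   # e.g. a simple reference model
model = [90.0, 140.0, 240.0, 190.0]      # e.g. an ensemble estimate

baseline_rmse = rmse(observed, baseline)
model_rmse = rmse(observed, model)
print(f"baseline RMSE: {baseline_rmse:.2f}")
print(f"model RMSE:    {model_rmse:.2f}")
print(f"reduction:     {relative_reduction(model_rmse, baseline_rmse):.1f}%")
```

A positive reduction means the model's estimates are closer to the observed values than the baseline's; a negative value would mean the model performs worse.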

Summary

Introduction

Influenza is a major public health problem causing up to five million severe cases and 500,000 deaths per year worldwide [1,2,3]. To alleviate the time delay in healthcare-based reporting, mathematical modeling and machine learning approaches have been proposed to produce disease estimates in real time, ahead of healthcare-based surveillance systems, in multiple nations around the world. Most of these studies have been designed and tested in developed nations, such as the United States and France, where information on disease outbreaks has been collected historically for decades [2]. One of the first and most prominent studies on the use of internet data for monitoring influenza epidemics is Google Flu Trends (GFT) [23, 24]. This web-based platform, created in 2009 and designed and deployed by Google, used the volume of selected Google search terms to estimate influenza-like illness (ILI) activity in real time. Near real-time estimates as well as one- and two-week-ahead forecasts are presented.

Materials and methods
Evaluation
Evaluation of data sources as predictors
Evaluation of statistical models
Discussion