Abstract

Background: Influenza outbreaks pose major challenges to public health around the world, leading to thousands of deaths a year in the United States alone. Accurate systems that track influenza activity at the city level are necessary to provide actionable information that can be used for clinical, hospital, and community outbreak preparation.

Objective: Although Internet-based real-time data sources such as Google searches and tweets have been successfully used to produce influenza activity estimates ahead of traditional health care–based systems at national and state levels, influenza tracking and forecasting at finer spatial resolutions, such as the city level, remain an open question. Our study aimed to present a precise, near real-time methodology capable of producing influenza estimates ahead of those collected and published by the Boston Public Health Commission (BPHC) for the Boston metropolitan area. This approach has great potential to be extended to other cities with access to similar data sources.

Methods: We first tested, separately, the ability of Google searches, Twitter posts, electronic health records, and a crowd-sourced influenza reporting system to detect influenza activity in the Boston metropolitan area. We then adapted a multivariate dynamic regression method named ARGO (autoregression with general online information), designed for tracking influenza at the national level, and showed that it effectively uses the above data sources to monitor and forecast influenza at the city level 1 week ahead of the current date. Finally, we presented an ensemble-based approach capable of combining information from models based on multiple data sources to more robustly nowcast as well as forecast influenza activity in the Boston metropolitan area. The performance of our models was evaluated in an out-of-sample fashion over 4 influenza seasons within 2012-2016, as well as a holdout validation period from 2016 to 2017.

Results: Our ensemble-based methods incorporating information from diverse models based on multiple data sources, including ARGO, produced the most robust and accurate results. The observed Pearson correlations between our out-of-sample flu activity estimates and those historically reported by the BPHC were 0.98 in nowcasting influenza and 0.94 in forecasting influenza 1 week ahead of the current date.

Conclusions: We show that information from Internet-based data sources, when combined using an informed, robust methodology, can be effectively used as early indicators of influenza activity at fine geographic resolutions.
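
As a rough illustration of the kind of model the Methods describe, the sketch below fits an L1-penalised regression on autoregressive lags of the reported influenza rate plus same-week online signals, produces rolling out-of-sample nowcasts, and scores them with the Pearson correlation. It is not the authors' ARGO implementation: the data are synthetic, and the two lags, the 52-week training window, the penalty `alpha`, and all function names are assumptions chosen for brevity.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def soft_threshold(z, t):
    """Shrink z towards zero by t (the proximal step of the L1 penalty)."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def fit_lasso(X, y, alpha=0.01, n_iter=50):
    """Minimise (1/2n)*||y - b0 - X b||^2 + alpha*||b||_1 by cyclic
    coordinate descent on centred features. Returns (b0, b)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]          # residual of the current fit
    col_sq = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_b = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_b
    intercept = y_mean - sum(m * b for m, b in zip(means, beta))
    return intercept, beta


def make_row(ili, signals, t, n_lags):
    """Features for week t: past reported rates plus same-week online signals."""
    lags = [ili[t - k] for k in range(1, n_lags + 1)]
    return lags + [s[t] for s in signals]


def rolling_nowcast(ili, signals, window=52, n_lags=2, alpha=0.01):
    """Refit on the previous `window` weeks, then estimate week t out of sample."""
    estimates, targets = [], []
    for t in range(window + n_lags, len(ili)):
        train_weeks = range(t - window, t)
        X = [make_row(ili, signals, s, n_lags) for s in train_weeks]
        y = [ili[s] for s in train_weeks]
        b0, b = fit_lasso(X, y, alpha)
        x_now = make_row(ili, signals, t, n_lags)
        estimates.append(b0 + sum(w * v for w, v in zip(b, x_now)))
        targets.append(ili[t])
    return estimates, targets


# Synthetic stand-ins (assumptions): a seasonal "reported" rate and two noisy proxies.
random.seed(0)
weeks = 156
ili = [2.0 + 1.5 * math.sin(2 * math.pi * t / 52) + random.gauss(0, 0.15)
       for t in range(weeks)]
search = [v + random.gauss(0, 0.2) for v in ili]
tweets = [0.5 * v + random.gauss(0, 0.2) for v in ili]

est, obs = rolling_nowcast(ili, [search, tweets])
print(f"out-of-sample Pearson r = {pearson(est, obs):.2f}")
```

Each estimate for week t uses only weeks before t for fitting, which is what makes the correlation an out-of-sample score. A 1-week-ahead forecast could be sketched the same way by pairing the features of week t with the reported rate of week t + 1.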

Highlights

  • Seasonal influenza is a major public health concern across the United States

  • The observed Pearson correlations between our out-of-sample flu activity estimates and those historically reported by the Boston Public Health Commission (BPHC) were 0.98 in nowcasting influenza and 0.94 in forecasting influenza 1 week ahead of the current date

  • Our study shows that novel influenza surveillance approaches that leverage information from Internet search engines, Twitter posts, self-reporting crowd-sourced influenza reports, and electronic health records (EHRs) can monitor and forecast influenza activity as reported by a well-established metropolitan surveillance system, in near real-time


Summary

Introduction

Traditional Influenza Surveillance

Seasonal influenza is a major public health concern across the United States. The Centers for Disease Control and Prevention (CDC) publishes weekly reports of national and multistate regional incidence, whereas state and city data are sometimes published by local agencies such as the Boston Public Health Commission (BPHC). These systems provide consistent historical information to track influenza-like illness (ILI) levels in the US population [5,6]. However, the time lag between data collection and publication delays knowledge of current influenza activity, limiting the ability to respond in a timely manner. Accurate systems that track influenza activity at the city level are necessary to provide actionable information that can be used for clinical, hospital, and community outbreak preparation.

