Avian Influenza Risk Surveillance in North America with Online Media.

Colin Robertson,Lauren Yee,Dena L Schanzer

doi:10.1371/journal.pone.0165688

Abstract

The use of Internet-based sources of information for health surveillance applications has increased in recent years, as a greater share of social and media activity happens through online channels. The potential surveillance value in online sources of information about emergent health events include early warning, situational awareness, risk perception and evaluation of health messaging among others. The challenge in harnessing these sources of data is the vast number of potential sources to monitor and developing the tools to translate dynamic unstructured content into actionable information. In this paper we investigated the use of one social media outlet, Twitter, for surveillance of avian influenza risk in North America. We collected AI-related messages over a five-month period and compared these to official surveillance records of AI outbreaks. A fully automated data extraction and analysis pipeline was developed to acquire, structure, and analyze social media messages in an online context. Two methods of outbreak detection; a static threshold and a cumulative-sum dynamic threshold; based on a time series model of normal activity were evaluated for their ability to discern important time periods of AI-related messaging and media activity. Our findings show that peaks in activity were related to real-world events, with outbreaks in Nigeria, France and the USA receiving the most attention while those in China were less evident in the social media data. Topic models found themes related to specific AI events for the dynamic threshold method, while many for the static method were ambiguous. Further analyses of these data might focus on quantifying the bias in coverage and relation between outbreak characteristics and detectability in social media data. Finally, while the analyses here focused on broad themes and trends, there is likely additional value in developing methods for identifying low-frequency messages, operationalizing this methodology into a comprehensive system for visualizing patterns extracted from the Internet, and integrating these data with other sources of information such as wildlife, environment, and agricultural data.

Highlights

Surveillance systems are essential to detect early warning signals in animal and human health and inform management strategies for populations at risk
The analysis here provides a synoptic view of using social media as a tool for sensing online media related to AI
We have demonstrated a fully automated methodology to acquire, structure, model and analyze social media data related to AI at a global scale

Summary

Objectives

Our aim was to produce an online-capable set of methods that analyzed data as it arrived, rather than a purely retrospective analysis of the dataset. We aimed to identify biases and characteristic in social media data relative to the OIE data

Methods

Results

Conclusion