Abstract

This paper leverages highly unstructured user-generated content for pollen allergy surveillance, using neural networks with character embeddings and an attention mechanism. Currently, there is no accurate representation of hay fever prevalence, particularly in real time. Social media offers an alternative source of knowledge about the condition, which is valuable for allergy sufferers, general practitioners, and policy makers. Despite this tremendous potential, conventional natural language processing methods prove limited when exposed to the challenging nature of user-generated content. As a result, detecting actual hay fever instances among the many false positives, and correctly identifying non-technical expressions as pollen allergy symptoms, pose a major problem. We propose a deep architecture enhanced with character embeddings and neural attention to improve the classification of hay fever-related content from Twitter data. The improvement in prediction is attributed to the character-level semantics introduced, which effectively address the out-of-vocabulary problem in our dataset, where the out-of-vocabulary rate is approximately 9%. Overall, the study is a step towards improved real-time pollen allergy surveillance from social media with state-of-the-art technology.
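
To make the out-of-vocabulary point concrete, here is a minimal sketch, not the paper's architecture. It assumes a toy word vocabulary and an invented tweet, and the helper names (`oov_rate`, `build_char_index`, `encode_chars`) are hypothetical. It shows that a token missing from a word vocabulary can still be encoded character by character, which is the kind of input a character-embedding layer would consume.

```python
from typing import Dict, Iterable, List, Set


def oov_rate(tokens: Iterable[str], vocabulary: Set[str]) -> float:
    """Return the fraction of tokens that are missing from the word vocabulary."""
    tokens = list(tokens)
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if token not in vocabulary)
    return missing / len(tokens)


def build_char_index(corpus: Iterable[str]) -> Dict[str, int]:
    """Map each character to an integer id (0 = padding, 1 = unknown character)."""
    chars = sorted({ch for text in corpus for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode_chars(token: str, char_index: Dict[str, int], max_len: int = 12) -> List[int]:
    """Encode a token as a fixed-length list of character ids, truncating or padding."""
    ids = [char_index.get(ch, 1) for ch in token[:max_len]]
    return ids + [0] * (max_len - len(ids))


# Toy data, invented for illustration only (not taken from the paper's dataset).
vocabulary = {"my", "hay", "fever", "is", "bad", "today"}
tweet = "my hayfever is sooo bad today".split()

# "hayfever" and "sooo" are not in the word vocabulary.
rate = oov_rate(tweet, vocabulary)

# The character index is built from the same toy vocabulary.
char_index = build_char_index(vocabulary)

# Every character of "hayfever" is known, so no unknown id (1) appears.
encoded = encode_chars("hayfever", char_index)
print(rate, encoded)
```

In a word-level model, "hayfever" would collapse to a single unknown token. At the character level it keeps a representation that overlaps with "hay" and "fever", and a model can learn from that overlap.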

Highlights

  • One in five Australians suffered from hay fever between 2014 and 2015 [1]

  • According to the World Health Organization [35], pollen allergy will only increase in prevalence and severity over the decade, making it a global concern

  • Twitter data has proven to be a valuable source of information on emerging symptoms and treatment usage from directly affected individuals

Summary

Introduction

One in five Australians suffered from hay fever between 2014 and 2015 [1]. According to the World Health Organization [35], pollen allergy will only increase in prevalence and severity over the decade, making it a global concern. Accurate estimates of hay fever remain the top priority for the Australian Institute of Health and Welfare. The substantial time lag in reporting results, together with insufficient data granularity, prevents an accurate real-time representation of pollen allergy prevalence and severity. To account for the limitations of existing methods, social media data mining for public health surveillance has been growing in popularity in the research community. The examples below contrast relevant and non-relevant posts whose similar wording causes confusion in classification.

