Abstract

Twitter offers extensive and valuable information on the spread of COVID-19 and the current state of public health. Mining tweets can be an important supplement for public health departments, helping them monitor the status of COVID-19 in a timely manner and take appropriate actions to minimize its impact. Identifying personal health mentions (PHMs) is the first step in public health surveillance on social media. The task is to determine whether a tweet mentions a person's health condition, and it is crucial for tracking pandemic conditions in real time. However, social media text is noisy: it contains many creative and novel phrases, sarcastic uses of emoji, and misspellings. In addition, the classes are usually severely imbalanced. To address these challenges, we built a COVID-19 PHM dataset containing more than 11,000 annotated tweets and used it to develop a dual convolutional neural network (CNN) framework. In this framework, an auxiliary CNN provides supplemental information to the primary CNN so that PHMs can be detected in tweets more effectively. Our experiments show that the proposed structure can alleviate the effect of class imbalance and achieve promising results. This automated approach could monitor public health in real time and spare disease-prevention departments the tedious manual work of public health surveillance.

