Abstract

Anti-vaccination attitudes have been an issue since the development of the first vaccines. The increasing use of social media as a source of health information may contribute to vaccine hesitancy, because anti-vaccination content is widely available on platforms such as Twitter. Being able to identify anti-vaccination tweets could provide useful information for formulating strategies to reduce anti-vaccination sentiments among different groups. This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets published during the COVID-19 pandemic. We compared the performance of bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory networks with pre-trained GloVe embeddings (Bi-LSTM) against classic machine learning methods, namely support vector machine (SVM) and naïve Bayes (NB). On the test set, the BERT model achieved accuracy = 91.6%, precision = 93.4%, recall = 97.6%, F1 score = 95.5%, and AUC = 84.7%. The Bi-LSTM model achieved accuracy = 89.8%, precision = 44.0%, recall = 47.2%, F1 score = 45.5%, and AUC = 85.8%. SVM with a linear kernel achieved accuracy = 92.3%, precision = 19.5%, recall = 78.6%, F1 score = 31.2%, and AUC = 85.6%. Complement NB achieved accuracy = 88.8%, precision = 23.0%, recall = 32.8%, F1 score = 27.1%, and AUC = 62.7%. In conclusion, the BERT model outperformed the Bi-LSTM, SVM, and NB models in this task. Moreover, the BERT model achieved excellent performance and can be used to identify anti-vaccination tweets in future studies.
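The F1 scores quoted above can be cross-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only an arithmetic check on the published figures, not the authors' evaluation code; the helper name `f1_score` and the dictionary layout are assumptions made for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) on the test set, copied from the abstract.
reported = {
    "BERT": (93.4, 97.6, 95.5),
    "Bi-LSTM": (44.0, 47.2, 45.5),
    "SVM (linear kernel)": (19.5, 78.6, 31.2),
    "Complement NB": (23.0, 32.8, 27.1),
}

# Recompute F1 (in percent) from the rounded precision and recall values.
recomputed = {
    name: 100 * f1_score(p / 100, r / 100)
    for name, (p, r, _) in reported.items()
}

for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: recomputed F1 = {recomputed[name]:.2f}%, reported = {f1_reported}%")
```

Because the inputs are already rounded to one decimal place, the recomputed values can differ from the reported ones by a few hundredths of a percentage point.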

Highlights

  • Vaccination is one of the most important public health achievements: it saves millions of lives annually and helps reduce the incidence of many infectious diseases, including eradicating smallpox [1]

  • This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets published during the COVID-19 pandemic, with the main focus on bidirectional long short-term memory networks with GloVe embeddings [25] (Bi-LSTM) and bidirectional encoder representations from transformers (BERT)

  • The BERT model outperformed the other models with an F1 score of 95.5%, which is more than two times higher than that of the Bi-LSTM model (45.5%) and more than three times higher than those of the support vector machine (SVM) with the linear kernel (31.2%) and the complement naïve Bayes (NB) (27.1%)

Introduction

Vaccination is one of the most important public health achievements: it saves millions of lives annually and helps reduce the incidence of many infectious diseases, including eradicating smallpox [1]. Nevertheless, anti-vaccination attitudes still exist in the population. A study by the American Academy of Pediatrics showed that 74% of pediatricians encountered a parent who declined or postponed at least one vaccine in a 12-month period [2]. The prevalence of non-medical vaccination exemption has increased in the last two decades, especially in U.S. states with less strict exemption criteria [3]. Vaccine hesitancy was named as one of the top ten threats to global health by the World Health Organisation in 2019 [4]. During the COVID pandemic, resulting in more than 120 million infections, 2.66 million deaths (as of 17 March 2021), and the development of safe and
