Predictive modelling of stigmatized behaviour in vaccination discussions on Facebook

Nadiya Straton, Ravi Vatrapu, Raghava Rao Mukkamala, Raymond Ng, Hyeju Jang

doi:10.1109/bibm47256.2019.8983175

Abstract

Facebook often serves as a platform for sharing health-related information and as a venue for expressing attitudes, thoughts, and frustrations within groups centered on healthcare themes. This information can be used for public health monitoring, with the aim of tackling stigmatized and stereotypical attitudes toward immunization or other health-related issues expressed on social media. However, the effectiveness of such attempts rests on our understanding of the concept of stigma and on modeling it correctly. In this study, we aim to expand the small pool of existing computational studies on stigma identification in a healthcare context. More specifically, we compare the following models using a dataset of 2,761 comments from Facebook: Convolutional Neural Network (CNN); Term Frequency-Inverse Document Frequency (TF-IDF) with Logistic Regression (LR), Support Vector Machine (SVM), Naive Bayes (NB), Multilayer Perceptron (MLP), Random Forest (RF), K-nearest neighbours (KNN), and Stochastic Gradient Descent (SGDC); Long Short-Term Memory networks (LSTM); Bidirectional Long Short-Term Memory (BiLSTM); and fastText. Accuracy results on an unbalanced data subset (with limited training samples) show that fastText gives the best performance, although BiLSTM and CNN achieve comparably good results on the unbalanced data as well. The CNN algorithm significantly outperforms the other algorithms on the balanced version of the dataset according to a paired sample t-test ( $p ).
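The paired sample t-test mentioned above can be illustrated with a minimal sketch. It assumes that the paired observations are accuracy scores of two models measured on the same evaluation splits; the abstract does not state what was paired, so this is an assumption. The function name `paired_t_statistic` and the scores in the usage lines are hypothetical placeholders, not results from the study.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for a paired sample t-test.

    scores_a and scores_b are matched measurements, for example the
    accuracy of two models on the same evaluation splits (an assumption
    made for this sketch).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    # The test operates on the per-pair differences.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    # t = mean difference divided by its standard error.
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical placeholder scores, NOT taken from the paper.
model_a = [5, 6, 7, 8]
model_b = [4, 4, 6, 5]
t_value, dof = paired_t_statistic(model_a, model_b)
```

The resulting statistic would then be compared against a Student t distribution with `dof` degrees of freedom to obtain a p-value.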

Full Text