Developing an Automatic System for Classifying Chatter About Health Services on Twitter: Case Study for Medicaid.

Yuan-Chi Yang, Whitney Bremer, Abeed Sarker, David Grande, Jane M Zhu, Mohammed Ali Al-Garadi

doi:10.2196/26616

Abstract

Background: The wide adoption of social media in daily life renders it a rich and effective resource for conducting near real-time assessments of consumers’ perceptions of health services. However, its use in these assessments can be challenging because of the vast amount of data and the diversity of content in social media chatter.

Objective: This study aims to develop and evaluate an automatic system involving natural language processing and machine learning to automatically characterize user-posted Twitter data about health services using Medicaid, the single largest source of health coverage in the United States, as an example.

Methods: We collected data from Twitter in two ways: via the public streaming application programming interface using Medicaid-related keywords (Corpus 1) and by using the website’s search option for tweets mentioning agency-specific handles (Corpus 2). We manually labeled a sample of tweets in 5 predetermined categories or other and artificially increased the number of training posts from specific low-frequency categories. Using the manually labeled data, we trained and evaluated several supervised learning algorithms, including support vector machine, random forest (RF), naïve Bayes, shallow neural network (NN), k-nearest neighbor, bidirectional long short-term memory, and bidirectional encoder representations from transformers (BERT). We then applied the best-performing classifier to the collected tweets for postclassification analyses to assess the utility of our methods.

Results: We manually annotated 11,379 tweets (Corpus 1: 9179; Corpus 2: 2200) and used 7930 (69.7%) for training, 1449 (12.7%) for validation, and 2000 (17.6%) for testing. A classifier based on BERT obtained the highest accuracies (81.7%, Corpus 1; 80.7%, Corpus 2) and F1 scores on consumer feedback (0.58, Corpus 1; 0.90, Corpus 2), outperforming the second best classifiers in terms of accuracy (74.6%, RF on Corpus 1; 69.4%, RF on Corpus 2) and F1 score on consumer feedback (0.44, NN on Corpus 1; 0.82, RF on Corpus 2). Postclassification analyses revealed differing intercorpora distributions of tweet categories, with political (400778/628411, 63.78%) and consumer feedback (15073/27337, 55.14%) tweets being the most frequent for Corpus 1 and Corpus 2, respectively.

Conclusions: The broad and variable content of Medicaid-related tweets necessitates automatic categorization to identify topic-relevant posts. Our proposed system presents a feasible solution for automatic categorization and can be deployed and generalized for health service programs other than Medicaid. Annotated data and methods are available for future studies.
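The abstract reports two evaluation metrics: overall accuracy and the F1 score on the consumer feedback category. The following is a minimal sketch of how these two quantities are conventionally computed from gold and predicted labels. It is not the authors' evaluation code. The function names, the label strings, and the six toy labels are hypothetical and chosen only for illustration.

```python
def accuracy(gold, pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def f1_for_class(gold, pred, label):
    """F1 score (harmonic mean of precision and recall) for one category."""
    pairs = list(zip(gold, pred))
    tp = sum(g == label and p == label for g, p in pairs)  # true positives
    fp = sum(g != label and p == label for g, p in pairs)  # false positives
    fn = sum(g == label and p != label for g, p in pairs)  # false negatives
    if tp == 0:
        # No correct predictions for this class: F1 is taken to be 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, not taken from the study's data).
gold = ["political", "consumer_feedback", "other",
        "consumer_feedback", "political", "other"]
pred = ["political", "consumer_feedback", "political",
        "other", "political", "other"]

print(f"accuracy: {accuracy(gold, pred):.3f}")
print(f"F1 (consumer_feedback): {f1_for_class(gold, pred, 'consumer_feedback'):.3f}")
```

Reporting a per-category F1 alongside accuracy matters when one category dominates a corpus: a classifier can reach high accuracy while still missing most posts in a less frequent category of interest.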

Highlights

Consumers’ perspectives and feedback are crucial for improving products or services
Our proposed system presents a feasible solution for automatic categorization and can be deployed and generalized for health service programs other than Medicaid
We found that the bidirectional encoder representations from transformers (BERT) classifier had the highest F1 scores on consumer feedback for both the validation set (0.61) and the test set from Corpus 1 (0.58)

Summary

Introduction

Consumers’ perspectives and feedback are crucial for improving products or services. Over the last two decades, widespread adoption of the internet has made it a major platform for collecting targeted consumer feedback. Businesses often allow consumers to rate specific products and services and provide detailed comments or reviews, and this has become a key feature of e-commerce platforms. Not only do consumers provide comments or seek assistance through social media accounts, but they also often engage in discussions about products or services within their own social networks. Such consumer-generated chatter is often used to assess perceptions about specific topics, which may range from products or services to social programs, legislation, and politics. The wide adoption of social media in daily life renders it a rich and effective resource for conducting near real-time assessments of consumers’ perceptions of health services. However, its use in these assessments can be challenging because of the vast amount of data and the diversity of content in social media chatter.

Methods

Results

Discussion

Conclusion