Abstract

Sentiment analysis is the extraction and categorization of sentiments expressed in text data using text analysis techniques. As earlier studies have shown, sentiment analysis of drug reviews has great potential to provide valuable insights that help healthcare professionals and companies evaluate the safety of drugs after they have been marketed. Such insights help safeguard patients and increase their trust in medical companies. Existing systems follow either a lexicon-based or a learning-based approach to sentiment analysis in the medical domain. Learning-based techniques require annotated data, while lexicon-based techniques tend to be domain-specific, which restricts their wider use. This research proposes a hybrid technique that combines learning-based and lexicon-based approaches to achieve better results. General-purpose sentiment lexicons, such as AFINN, TextBlob, and VADER, are used to annotate the reviews. Several feature engineering techniques, namely term frequency (TF), term frequency-inverse document frequency (TF-IDF), and the union of TF and TF-IDF (TF U TF-IDF), are then used to extract useful features. Finally, learning models including logistic regression (LR), AdaBoost classifier (AB), random forest (RF), extra tree classifier (ETC), and multilayer perceptron (MLP) are used to classify the sentiments of the reviews. The performance of the proposed hybrid approach is evaluated using accuracy, precision, recall, and F1-score. Experimental results indicate that combining learning-based and lexicon-based approaches provides better results than using either approach alone. Moreover, TextBlob shows promising results, giving an accuracy of 96% with MLP when used with TF-IDF and with LR when used with TF U TF-IDF.
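The lexicon-based annotation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny word list, the averaging rule, and the zero threshold are all assumptions made for illustration, standing in for a general-purpose lexicon such as TextBlob or VADER. The idea is only that each review receives a polarity score from the lexicon, and the score is mapped to a sentiment label that later serves as the training target for a learning model.

```python
import re

# Hypothetical toy lexicon (word -> polarity in [-1, 1]); illustrative only,
# not taken from any real sentiment lexicon.
TOY_LEXICON = {
    "effective": 0.6, "relief": 0.5, "great": 0.8, "helped": 0.5,
    "nausea": -0.5, "worse": -0.6, "terrible": -1.0, "pain": -0.4,
}

def polarity(review, lexicon=TOY_LEXICON):
    """Average the polarity of all lexicon words found in the review."""
    tokens = re.findall(r"[a-z']+", review.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

def annotate(review, threshold=0.0):
    """Map a polarity score to a label; the threshold is an assumption."""
    score = polarity(review)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

reviews = [
    "This drug was effective and gave great relief",
    "Terrible nausea and the pain got worse",
    "I took it twice a day",
]
labels = [annotate(r) for r in reviews]
print(list(zip(reviews, labels)))
```

The resulting labels play the role of the annotated data that a purely learning-based approach would otherwise need from human annotators.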

Highlights

  • Drug safety is a topic of great interest nowadays, given the approval of new medicines and the cross-examination of possible drug withdrawals from the market [1]

  • The effectiveness of TextBlob, valence aware dictionary and sentiment reasoner (VADER), and AFINN is evaluated as a data annotation approach for drug reviews

  • Two traditional approaches, term frequency (TF) and term frequency-inverse document frequency (TF-IDF), and one modified approach that unites TF and TF-IDF are used on the annotated data for feature extraction
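The feature extraction step in the last highlight can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes that "uniting" TF and TF-IDF means concatenating the two feature vectors for each review, and it uses a common smoothed IDF formula, ln((1 + n) / (1 + df)) + 1, which the source does not specify. All function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocab(docs):
    """Sorted list of all distinct tokens in the corpus."""
    return sorted({t for d in docs for t in tokenize(d)})

def tf_vectors(docs, vocab):
    """Raw term counts per document, one column per vocabulary term."""
    vectors = []
    for d in docs:
        counts = Counter(tokenize(d))
        vectors.append([counts[t] for t in vocab])
    return vectors

def idf_weights(docs, vocab):
    """Smoothed IDF (an assumed variant): ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    token_sets = [set(tokenize(d)) for d in docs]
    weights = []
    for t in vocab:
        df = sum(1 for s in token_sets if t in s)
        weights.append(math.log((1 + n) / (1 + df)) + 1.0)
    return weights

def tfidf_vectors(tf, idf):
    """Scale each term count by the IDF weight of its term."""
    return [[c * w for c, w in zip(row, idf)] for row in tf]

def union_features(tf, tfidf):
    """Assumed meaning of TF U TF-IDF: concatenate both vectors per document."""
    return [a + b for a, b in zip(tf, tfidf)]

docs = ["good drug good relief", "bad drug"]
vocab = build_vocab(docs)
tf = tf_vectors(docs, vocab)
idf = idf_weights(docs, vocab)
tfidf = tfidf_vectors(tf, idf)
features = union_features(tf, tfidf)
print(vocab)
print(features[0])
```

Under this reading, each review is represented by twice as many features as the vocabulary has terms, and a classifier such as LR or MLP is trained on the combined vector.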

Summary

Introduction

Drug safety is a topic of great interest nowadays, given the approval of new medicines and the cross-examination of possible drug withdrawals from the market [1]. Although a drug is tested and evaluated before its approval, many adverse effects are seen only after it has been released in the market and used by a large number of people [2]. Accurate and effective evaluation of a drug after its launch is therefore of significant importance for companies and the general public. The wide use of web platforms has paved the way for gathering the opinions, views, and suggestions of the general public on various products and services, and medical products are no exception. Internet users participate in healthcare web forums to share their experiences of various drugs, diagnoses, treatments, and allergic reactions, and to seek guidance from medical experts [3]. Physicians also write reviews about the use of a particular drug in specific circumstances.

