Abstract

Sentiment analysis refers to the automatic collection, aggregation, and classification of online data into different emotion classes. While most work on sentiment analysis of text focuses on binary and ternary classification, multi-class classification has received less attention. It remains a challenging task given the complexity of natural languages and the difficulty of understanding and mathematically quantifying how humans express their feelings. In this paper, we study multi-class classification of online posts by Twitter users, and show how far the classification can go as well as the limitations and difficulties of the task. The proposed multi-class approach achieves an accuracy of 60.2% for 7 different sentiment classes which, compared to an accuracy of 81.3% for binary classification, emphasizes the effect of having multiple classes on classification performance. Nonetheless, we propose a novel model to represent the different sentiments and show how this model helps to understand how sentiments are related. The model is then used to analyze the challenges that multi-class classification presents and to highlight possible future enhancements to multi-class classification accuracy.
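
One way to read the two accuracy figures side by side is to compare each with the accuracy of random guessing. The following is a minimal sketch, not part of the paper's method: it assumes the classes are uniformly distributed (the abstract does not state the class balance), so that chance accuracy is 1/k for k classes. The function names `chance_accuracy` and `lift_over_chance` are illustrative.

```python
def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing.

    Assumption: classes are balanced, so chance is 1 / num_classes.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    return 1.0 / num_classes


def lift_over_chance(accuracy: float, num_classes: int) -> float:
    """Ratio of an observed accuracy to the uniform-chance baseline."""
    return accuracy / chance_accuracy(num_classes)


# Accuracy figures quoted in the abstract: (accuracy, number of classes).
reported = {
    "binary": (0.813, 2),
    "multi-class": (0.602, 7),
}

for name, (acc, k) in reported.items():
    print(
        f"{name}: accuracy={acc:.1%}, "
        f"chance={chance_accuracy(k):.1%}, "
        f"lift={lift_over_chance(acc, k):.2f}x"
    )
```

With imbalanced classes the baseline would differ (for example, always predicting the majority class), so the ratios here are only indicative.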

Highlights

  • In recent years, increasing attention has been paid to the analysis of data collected from social networks and microblogging websites

  • Due to the wide use of hashtags, companies can trace “tweets” (i.e., messages posted by Twitter users) that deal with their own products or services [Big Data Mining and Analytics, September 2019, 2(3): 181–194]

  • We propose a new model to represent sentiments, and use it to show the relationships between the different sentiments and to explain why the task of multi-class sentiment analysis is inherently difficult


Summary

Introduction

In recent years, increasing attention has been paid to the analysis of data collected from social networks and microblogging websites. This is because people use these services to discuss all sorts of topics, including their daily affairs and plans and the services or products they use. Due to the wide use of hashtags, companies can trace “tweets” (i.e., messages posted by Twitter users) that deal with their own products or services.

