Leveraging distant supervision and deep learning for Twitter sentiment and emotion classification

Muhamet Kastrati, Zenun Kastrati, Ali Shariq Imran, Marenglen Biba

doi:10.1007/s10844-024-00845-0

Open Access

Journal: Journal of Intelligent Information Systems	Publication Date: Mar 22, 2024
Citations: 4	License type: CC BY 4.0

Abstract

Applications in industry, healthcare, and security increasingly rely on automatic sentiment analysis and emotion detection in short texts such as social media posts. Twitter stands out as one of the most popular social media platforms, owing to the easy access its API provides. Supervised learning is the most widely used paradigm for sentiment polarity and fine-grained emotion detection in short, informal texts such as tweets. However, supervised models are data-hungry and depend on abundant labeled data, which remains difficult to obtain. This study addresses that challenge by creating a large-scale, real-world dataset of 17.5 million tweets. A distant supervision approach that relies on the emojis present in tweets is used to label each tweet with one of Ekman’s six basic emotions. We then conducted a series of experiments on the dataset with conventional machine learning models and deep learning models, including transformer-based models, to establish baseline results. The experimental results and an extensive ablation analysis show that a BiLSTM with FastText embeddings and an attention mechanism outperforms the other models in both classification tasks, achieving an F1-score of 70.92% for sentiment classification and 54.85% for emotion detection.
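
To make the emoji-based distant supervision idea concrete, the following is a minimal sketch in Python. It is not the authors' pipeline: the emoji lexicon `EMOJI_TO_EMOTION`, the function name `distant_label`, and the rule of discarding tweets whose emojis point to more than one emotion are all assumptions made for illustration. The sketch also strips the label emojis from the text, on the assumption that a classifier should not see the signal that produced its label.

```python
from typing import Optional, Tuple

# Hypothetical emoji-to-emotion lexicon covering Ekman's six basic emotions.
# The mapping actually used in the paper is not given in the abstract.
EMOJI_TO_EMOTION = {
    "😠": "anger", "😡": "anger",
    "🤢": "disgust", "🤮": "disgust",
    "😨": "fear", "😱": "fear",
    "😀": "joy", "😂": "joy",
    "😢": "sadness", "😭": "sadness",
    "😮": "surprise", "😲": "surprise",
}


def distant_label(tweet: str) -> Optional[Tuple[str, str]]:
    """Return (cleaned_text, emotion), or None if the tweet cannot be labeled.

    Assumed rule: a tweet is labeled only when all of its lexicon emojis
    agree on a single emotion; otherwise it is discarded as ambiguous.
    """
    emotions = {EMOJI_TO_EMOTION[ch] for ch in tweet if ch in EMOJI_TO_EMOTION}
    if len(emotions) != 1:
        return None  # no lexicon emoji, or conflicting emotions
    # Remove the label-bearing emojis, then collapse leftover whitespace.
    text = "".join(ch for ch in tweet if ch not in EMOJI_TO_EMOTION)
    return " ".join(text.split()), emotions.pop()
```

For example, `distant_label("I missed the bus again 😭")` yields the cleaned text paired with `"sadness"`, while a tweet containing both 😂 and 😭 is skipped. Labels produced this way are noisy by construction, which is the usual trade-off of distant supervision against manual annotation.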

Full Text