Adaptive multitask emotion recognition and sentiment analysis using resource-constrained MobileBERT and DistilBERT: an efficient approach for edge devices
Emotion recognition and sentiment analysis are crucial tasks in natural language processing, enabling machines to understand human emotions and opinions. However, the complex, nuanced relationship between emotions and sentiment in conversation poses significant challenges to accurate emotion recognition, as sentiment cues can easily be misinterpreted. Deploying emotion recognition and sentiment analysis on edge devices poses further challenges due to computational resource constraints. We present an adaptive multitask learning approach that jointly leverages the resource-constrained Mobile Bidirectional Encoder Representations from Transformers (MobileBERT) and Distilled BERT (DistilBERT) models to optimise emotion recognition and sentiment analysis. Our proposed approach utilises prototypical networks to learn effective representations of emotions and sentiment, while a focal weighted loss function effectively mitigates class imbalance. We adaptively fine-tune the learning process to balance task importance and resource utilisation, resulting in better performance and efficiency. Our experimental results demonstrate the efficacy of our method, achieving the best results on the MELD and IEMOCAP benchmark datasets while keeping a compact model size. Despite limited computational demands, our solution demonstrates that emotion and sentiment analysis can deliver performance comparable to resource-intensive large language models (LLMs), facilitating various applications in human-computer interaction, affective computing, social media, conversational dialogue, and healthcare.
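The focal weighted loss this abstract mentions can be sketched in a few lines. Below is a generic NumPy version (the gamma value and class weights are illustrative, not the paper's): the (1 - p)^gamma factor down-weights easy, well-classified examples, while per-class alpha weights counter class imbalance.

```python
import numpy as np

def focal_weighted_loss(probs, labels, alpha, gamma=2.0):
    """Focal loss with per-class weights alpha.

    probs:  (N, C) predicted class probabilities (rows sum to 1)
    labels: (N,) integer class indices
    alpha:  (C,) per-class weights, larger for rarer classes
    """
    p_t = probs[np.arange(len(labels)), labels]   # probability of the true class
    a_t = np.asarray(alpha)[labels]               # weight of the true class
    # (1 - p_t)^gamma shrinks the contribution of confident predictions
    return float(np.mean(-a_t * (1.0 - p_t) ** gamma * np.log(p_t + 1e-12)))

probs = np.array([[0.9, 0.1], [0.3, 0.7]])
labels = np.array([0, 1])
loss = focal_weighted_loss(probs, labels, alpha=[1.0, 2.0])
```

With gamma set to 0 this reduces to a weighted cross-entropy, which makes the down-weighting effect easy to check empirically.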
- Dissertation
- 10.4995/thesis/10251/172164
- Sep 2, 2021
In recent years, Deep Learning (DL) has revolutionised the potential of automatic systems that handle Natural Language Processing (NLP) tasks, and we have witnessed a tremendous advance in their performance. Nowadays, embedded systems are ubiquitous, determining the intent of the text we write, the sentiment of our tweets, or our political views, to cite some examples. In this thesis, we propose several NLP models for tasks that deal with social media text. Concretely, this work focuses mainly on Sentiment Analysis and Personality Recognition. Sentiment Analysis, one of the leading problems in NLP, consists of determining the polarity of a text; it is a well-known task for which the number of proposed resources and models is vast. In contrast, Personality Recognition, which aims to determine users' personality from their writing style, is more of a niche task with fewer ad-hoc resources but great potential. Although the principal focus of this work was on the development of Deep Learning models, we have also proposed models based on linguistic resources and classical Machine Learning. Moreover, in this more straightforward setup, we have explored the nuances of different language devices, such as the impact of emotions on the correct classification of the sentiment expressed in a text. Afterwards, DL models, particularly Convolutional Neural Networks (CNNs), were developed to address the previously described tasks. In the case of Personality Recognition, we explored both approaches, which allowed us to compare the models under the same circumstances. Notably, NLP has evolved dramatically in recent years through public evaluation campaigns, where multiple research teams compare the performance of their approaches under the same conditions.
Most of the models presented here were either assessed in an evaluation task or used its setup. Recognising the importance of this effort, we curated and organised an evaluation campaign for classifying political tweets. In addition, as this work advanced, we decided to study CNNs applied to NLP tasks in depth. Two lines of work were explored in this regard. Firstly, we proposed a semantic-based padding method for CNNs, which addresses how to represent text more appropriately for solving NLP tasks. Secondly, a theoretical framework was introduced for tackling one of the most frequent criticisms of Deep Learning: interpretability. This framework seeks to visualise what lexical patterns, if any, the CNN learns in order to classify a sentence. In summary, the main achievements presented in this thesis are:
- The organisation of an evaluation campaign for Topic Classification of texts gathered from social media.
- The proposal of several Machine Learning models tackling the Sentiment Analysis task on social media, together with a study of the impact of linguistic devices such as figurative language on the task.
- The development of a model for inferring a developer's personality from the source code they have written.
- The study of Personality Recognition from social media following two different approaches: models based on machine learning algorithms with handcrafted features, and models based on CNNs, with both approaches compared.
- The introduction of new semantic-based paddings for optimising how text is represented in CNNs.
- The definition of a theoretical framework providing interpretable information about what CNNs learn internally.
- Research Article
- 10.1038/s41598-024-60210-7
- Apr 26, 2024
- Scientific Reports
Sentiment analysis is an essential task in natural language processing that involves identifying a text’s polarity, whether it expresses positive, negative, or neutral sentiments. With the growth of social media and the Internet, sentiment analysis has become increasingly important in various fields, such as marketing, politics, and customer service. However, sentiment analysis becomes challenging when dealing with foreign languages, particularly without labelled data for training models. In this study, we propose an ensemble model of transformers and a large language model (LLM) that performs sentiment analysis of foreign languages by translating them into a base language, English. We used four languages, Arabic, Chinese, French, and Italian, and translated them using two neural machine translation models: LibreTranslate and Google Translate. Sentences were then analyzed for sentiment using an ensemble of pre-trained sentiment analysis models: Twitter-Roberta-Base-Sentiment-Latest, bert-base-multilingual-uncased-sentiment, and GPT-3, an LLM from OpenAI. Our experimental results showed that the accuracy of sentiment analysis on translated sentences was over 86% using the proposed model, indicating that foreign language sentiment analysis is possible through translation to English, and that the proposed ensemble model works better than the independent pre-trained models and the LLM.
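The final stage of the pipeline described above (translate, run several sentiment models, combine their labels) can be reduced to a majority vote over the per-model labels. A minimal sketch, assuming each model has already returned one label per sentence; the tie-breaking rule (prefer the first-listed model) is an assumption, not the paper's:

```python
from collections import Counter

def ensemble_vote(predictions):
    """Majority vote over per-model sentiment labels.

    Ties are broken in favour of the earliest-listed model,
    an illustrative policy rather than the paper's.
    """
    counts = Counter(predictions)
    top = counts.most_common(1)[0][1]
    for p in predictions:                 # model order encodes priority on ties
        if counts[p] == top:
            return p

# one label per (hypothetical) model for a single translated sentence
votes = ["positive", "positive", "negative"]
label = ensemble_vote(votes)
```

A weighted variant (e.g., scaling each model's vote by its validation accuracy) is a common refinement of the same idea.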
- Research Article
- 10.1142/s0218348x24400565
- Sep 18, 2024
- Fractals
Sentiment analysis is a vital task in natural language processing (NLP) that aims to identify and extract the emotional states and opinions of text. In this study, we conduct a comprehensive comparison of large language models (LLMs), such as ChatGPT and Google Bard, with conventional methods in sentiment analysis. We employ a rigorous evaluation framework that covers four essential metrics: accuracy, precision, recall, and the F1-score. Our results reveal that TextBlob outperforms other methods, achieving an impressive accuracy of 69% and precision of 83%. On the other hand, Bard shows a relatively poor performance, with only 39% accuracy and 46% precision. This study offers valuable insights into the diverse capabilities of AI models in sentiment analysis. A key finding of this study is the importance of model selection according to the specific requirements of the task. Each model has its own strengths and weaknesses, which are reflected in their performance profiles. Moreover, the context in which these models operate is crucial. For instance, ChatGPT generates varied responses, Bard struggles with multiple sentences, and Robustly Optimized BERT Pretraining Approach (RoBERTa) balances precision and recall. This study also reveals the performance gap between LLMs and state-of-the-art deep learning methods. We believe this work will inspire future research and applications of ChatGPT and similar AI models in sentiment analysis and related tasks.
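The four evaluation metrics this study relies on are straightforward to compute from label pairs. A self-contained sketch for a single positive class (a binary setting is assumed here for illustration):

```python
def evaluate(y_true, y_pred, positive):
    """Return (accuracy, precision, recall, F1) for one positive class."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = evaluate(
    ["pos", "pos", "neg", "neg"], ["pos", "neg", "neg", "pos"], positive="pos"
)
```

For the multi-class case the study implies, the same per-class figures would typically be macro- or micro-averaged.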
- Research Article
- 10.1016/j.knosys.2023.111148
- Nov 2, 2023
- Knowledge-Based Systems
Efficient utilization of pre-trained models: A review of sentiment analysis via prompt learning
- Research Article
- 10.31127/tuje.1698748
- Jul 9, 2025
- Turkish Journal of Engineering
Sentiment analysis (SA) is an influential task in natural language processing that aims to understand and categorize the underlying sentiment expressed in text. Due to the fast growth of technology, social media has become an increasingly familiar part of daily life: a platform for people to share and express their opinions, experiences, attitudes, reactions, etc. The purpose of sentiment analysis is to identify whether the emotion conveyed in a classified text is positive, negative, neutral, or another individual sentiment, in order to understand the emotional context of the text. Deep learning techniques have shown remarkable performance in sentiment analysis tasks, outperforming traditional machine learning algorithms. This article presents a comparative analysis of three deep learning models, multilayer perceptron (MLP), 1-dimensional convolutional neural networks (1D-CNN), and long short-term memory (LSTM) networks, for sentiment analysis of social media contents (SMC). The experiments are conducted on publicly available benchmark datasets of US airline sentiment tweets for binary and ternary classes. Likewise, we explore the impact of various pre-processing techniques, such as punctuation elimination, special-symbol removal, stop word removal, strange word removal, lowercase conversion, stemming, lemmatization, and tokenization, on the performance of deep learning models for sentiment analysis. The results demonstrate that the LSTM network achieves a high accuracy of 94.67%, an F1-score of 95.26%, and a low error rate of 5.33% on the binary-class dataset, followed by 1D-CNN and MLP, while the MLP technique achieves better results than the other methods on the ternary-class dataset.
The findings of this study contribute to the existing literature by providing insights into the comparative performance of different deep-learning architectures for sentiment analysis and by highlighting the importance of pre-processing techniques in achieving accurate sentiment classification.
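Several of the pre-processing steps the study compares (lowercasing, punctuation and special-symbol removal, stop word removal, tokenization) can be chained into one small function. A minimal sketch with an illustrative stop-word list, not the study's exact pipeline:

```python
import re
import string

STOP_WORDS = {"the", "a", "an", "is", "to", "and"}   # illustrative subset only

def preprocess(text):
    """Lowercase, strip punctuation, drop stop words, and tokenize."""
    text = text.lower()
    # replace every punctuation character with a space
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    tokens = text.split()
    return [t for t in tokens if t not in STOP_WORDS]

tokens = preprocess("The flight is GREAT!!!")
```

Stemming and lemmatization, also compared in the study, would slot in as one more mapping over the token list (e.g., via NLTK or spaCy).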
- Research Article
- 10.11591/ijeecs.v40.i1.pp490-498
- Oct 1, 2025
- Indonesian Journal of Electrical Engineering and Computer Science
Emotion recognition from text is a crucial task in natural language processing (NLP) with applications in sentiment analysis, human-computer interaction, and psychological research. In this study, we present a novel approach for text-based emotion recognition using a modified firefly algorithm (MFA). The firefly algorithm is a swarm intelligence method inspired by the bioluminescent communication of fireflies, and it is known for its simplicity and efficiency in optimization tasks. In this paper, the MFA-based model is evaluated on the International Survey on Emotion Antecedents and Reactions (ISEAR) dataset, which includes text entries categorized by various emotions. Experimental results indicate that our approach achieved promising outcomes. Specifically, the proposed method, which combines the firefly algorithm with a multilayer perceptron (MLP), attained an accuracy of 92.07%, surpassing most other approaches reported in the literature.
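The core update that the MFA modifies is compact: each firefly moves toward every brighter (lower-loss) firefly, with attractiveness decaying with distance, plus a small random step. A generic, unmodified sketch on a toy minimisation problem; all hyperparameters are illustrative, not the paper's:

```python
import numpy as np

def firefly_minimize(f, dim, n=15, iters=100, beta0=1.0, gamma=1.0,
                     alpha=0.2, bounds=(-5.0, 5.0), seed=0):
    """Minimal canonical firefly optimizer for a scalar loss f."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n, dim))
    for _ in range(iters):
        fit = np.array([f(xi) for xi in x])
        for i in range(n):
            for j in range(n):
                if fit[j] < fit[i]:                       # firefly j is brighter
                    r2 = np.sum((x[i] - x[j]) ** 2)
                    beta = beta0 * np.exp(-gamma * r2)    # attractiveness decays with distance
                    x[i] += beta * (x[j] - x[i]) + alpha * rng.normal(size=dim)
                    x[i] = np.clip(x[i], lo, hi)
        alpha *= 0.97                                     # gradually cool the random walk
    best = min(x, key=f)
    return best, float(f(best))

best, val = firefly_minimize(lambda v: float(np.sum(v ** 2)), dim=2)
```

In the paper's setting the vector being optimised would encode MLP parameters (or hyperparameters) and f would be the classification loss on ISEAR, which is the part the MFA then modifies.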
- Research Article
- 10.48084/etasr.10331
- Jun 4, 2025
- Engineering, Technology & Applied Science Research
Large Language Models (LLMs) have shown outstanding performance in many Natural Language Processing (NLP) tasks for high-resource languages, especially English, primarily because most of them were trained on widely available text resources. As a result, many low-resource languages, such as Arabic and African languages and their dialects, are not well studied, raising concerns about whether LLMs can perform fairly across them. Therefore, evaluating the performance of LLMs for low-resource languages and diverse dialects is crucial. This study investigated the performance of LLMs in Moroccan Arabic, a low-resource dialect spoken by approximately 30 million people. The performance of 14 Arabic pre-trained models was evaluated on the Moroccan dialect, employing 11 datasets across various NLP tasks such as text classification, sentiment analysis, and offensive language detection. The evaluation results showed that MARBERTv2 achieved the highest overall average F1-score of 83.47, while the second-best model, DarijaBERT-mix, had an average F1-score of 83.38. These findings provide valuable insights into the effectiveness of current LLMs for low-resource languages, particularly the Moroccan dialect.
- Research Article
- 10.1016/j.inffus.2023.101861
- Jun 3, 2023
- Information Fusion
OpenAI has released the Chat Generative Pre-trained Transformer (ChatGPT) and revolutionized the approach in artificial intelligence to human-model interaction. The first contact with the chatbot reveals its ability to provide detailed and precise answers in various areas. Several publications on ChatGPT evaluation test its effectiveness on well-known natural language processing (NLP) tasks. However, the existing studies are mostly non-automated and tested on a very limited scale. In this work, we examined ChatGPT’s capabilities on 25 diverse analytical NLP tasks, most of them subjective even to humans, such as sentiment analysis, emotion recognition, offensiveness, and stance detection. The other tasks require more objective reasoning, like word sense disambiguation, linguistic acceptability, and question answering. We also evaluated the GPT-4 model on five selected subsets of NLP tasks. We automated the ChatGPT and GPT-4 prompting process and analyzed more than 49k responses. Our comparison of its results with available State-of-the-Art (SOTA) solutions showed that the average loss in quality of the ChatGPT model was about 25% for zero-shot and few-shot evaluation. For the GPT-4 model, the loss for semantic tasks is significantly lower than for ChatGPT. We showed that the more difficult the task (lower SOTA performance), the higher the ChatGPT loss. This especially applies to pragmatic NLP problems like emotion recognition. We also tested the ability to personalize ChatGPT responses for selected subjective tasks via Random Contextual Few-Shot Personalization, and we obtained significantly better user-based predictions. Additional qualitative analysis revealed a ChatGPT bias, most likely due to the rules imposed on human trainers by OpenAI.
Our results provide the basis for a fundamental discussion of whether the high quality of recent predictive NLP models can indicate a tool’s usefulness to society and how the learning and validation procedures for such systems should be established.
- Research Article
- 10.1051/itmconf/20257605004
- Jan 1, 2025
- ITM Web of Conferences
Social media platforms have become a significant medium for expressing opinions, emotions, and sentiments, making sentiment analysis a crucial task in Natural Language Processing (NLP). While various sentiment analysis techniques have been proposed, existing studies often face challenges such as language dependency, platform-specific biases, lack of real-time processing, and limited multimodal analysis. This research explores the evolution of sentiment analysis in social media by leveraging cutting-edge NLP techniques, including transformer-based models (BERT, RoBERTa, GPT) and multimodal approaches. By addressing the limitations of previous studies, our research proposes a real-time, multilingual, and cross-platform sentiment analysis model capable of analyzing textual, audio, and visual content from diverse social media platforms (e.g., Twitter, Facebook, Instagram, and TikTok). Additionally, this study investigates the effectiveness of domain-specific sentiment analysis (e.g., political discourse, health-related discussions) to improve sentiment classification in specialized contexts. Benchmark datasets and experimental validation will be used to compare existing sentiment analysis models with our proposed approach. Our findings aim to enhance scalability, accuracy, and real-time adaptability of sentiment analysis in social media applications, ultimately contributing to improved decision-making in social monitoring, brand analysis, and crisis management.
- Research Article
- Jan 1, 2024
- AMIA ... Annual Symposium proceedings. AMIA Symposium
Health-related social media data generated by patients and the public provide valuable insights into patient experiences and opinions toward health issues such as vaccination and medical treatments. Using Natural Language Processing (NLP) methods to analyze such data, however, often requires high-quality annotations that are difficult to obtain. The recent emergence of Large Language Models (LLMs) such as the Generative Pre-trained Transformers (GPTs) has shown promising performance on a variety of NLP tasks in the health domain with little to no annotated data. However, their potential in analyzing health-related social media data remains underexplored. In this paper, we report empirical evaluations of LLMs (GPT-3.5-Turbo, FLAN-T5, and BERT-based models) on a common NLP task of health-related social media data: sentiment analysis for identifying opinions toward health issues. We explored how different prompting and fine-tuning strategies affect the performance of LLMs on social media datasets across diverse health topics, including Healthcare Reform, vaccination, mask wearing, and healthcare service quality. We found that LLMs outperformed VADER, a widely used off-the-shelf sentiment analysis tool, but are far from being able to produce accurate sentiment labels. However, their performance can be improved by data-specific prompts with information about the context, task, and targets. The highest performing LLMs are BERT-based models that were fine-tuned on aggregated data. We provide practical tips for researchers to use LLMs on health-related social media data for optimal outcomes. We also discuss future work needed to continue to improve the performance of LLMs for analyzing health-related social media data with minimal annotations.
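The paper's finding that data-specific prompts with information about the context, task, and target improve LLM sentiment labels can be illustrated with a simple prompt builder. The template wording below is an assumption for illustration, not the study's exact prompt:

```python
def sentiment_prompt(text, topic, target):
    """Build a data-specific sentiment prompt embedding context
    (topic), task (sentiment classification), and target.
    The phrasing is illustrative, not the paper's template."""
    return (
        f"The following social media post discusses {topic}.\n"
        f'Post: "{text}"\n'
        f"Question: what is the author's sentiment toward {target}? "
        f"Answer with exactly one of: positive, negative, neutral."
    )

prompt = sentiment_prompt(
    text="Masks made my commute bearable again.",
    topic="mask wearing",
    target="mask mandates",
)
```

Constraining the answer space in the final line is a common way to make LLM outputs easy to map back onto discrete sentiment labels.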
- Research Article
- 10.6342/ntu.2015.00858
- Jan 1, 2015
The non-literal or non-lexical aspects of communication cannot be interpreted directly and literally. The identification and analysis of real intent beyond literal meanings is a challenging task in natural language processing, especially when working on microtexts such as microblogs that are limited to 140 characters. The recognition and analysis of these components are crucial for many applications including sentiment analysis, opinion mining, question answering and chatterbots. In this study, emotion recognition, online advertising legality identification and verbal irony analysis are examined. In the emotion recognition experiments, the generation of user emotions on a microblogging platform is modeled from both writers’ and readers’ perspectives. Graphic emoticons, which are commonly used to express users’ emotions, serve as emotion labels so that microtext emotion datasets can be constructed. To build classifiers for the emotion identification task, support vector machine (SVM)-based algorithms are adopted. In addition to textual features, non-verbal factors, including social relation, user behavior and relevance degree, are also used as features. The experimental results show that the combination of textual, social and behavioral features can be used to achieve the best emotion-prediction performance. The emotional transitions from the poster to the responder in a conversation are also analyzed and predicted in this study. As online advertising continues to grow, Internet users, advertisers, online advertising platforms and the authorities all have the need to avoid or prevent the issues that false and/or misleading advertisements can potentially cause. Many of these false advertising messages are present in short texts, and their appropriateness cannot be easily interpreted. 
This problem is addressed by building one-class and two-class classifiers with datasets consisting of short illegal advertising statements published by the government and product descriptions from an online shopping website. The results show that the models using the log relative frequency ratio (logRF) combined with unigrams as features achieve the best performance. The logRF values are also used to mine verb phrases that are typically used in illegal advertisements. These verb phrases can be used as a reference for both the advertisers and the authorities. A web-based false advertisement recognition system was also built in this study using the techniques applied to the above experiments in order to reduce human effort in filtering false advertising messages and help protect Internet users from misleading advertising. In verbal irony, the literal meaning of an utterance can be the opposite of what is actually meant. For simplification, this study focuses on ironic expressions in which negative actual meanings are represented by positive words. Ironic messages in microblogs are infrequent and cannot be identified by simply examining the literal meanings of the words. To construct a Chinese irony corpus, ironic messages are collected from microblogs based on emoticon use, linguistic forms and sentiment polarity through a bootstrapping approach. Five types of irony patterns are found in the collected ironic messages. The structure of ironic expressions is also analyzed, and three types of elements are found to form an ironic expression. A conditional random field (CRF)-based approach is used to automatically identify irony elements and ironic messages and reduce the human effort in the bootstrapping approach of irony pattern discovery.
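The log relative frequency ratio (logRF) feature used above scores how much more frequent a unigram is in the illegal-advertising corpus than in a background corpus. A minimal sketch; the add-k smoothing that keeps unseen words finite is an assumption, not necessarily the thesis's exact formulation:

```python
import math
from collections import Counter

def log_rf_ratio(target_tokens, background_tokens, k=1.0):
    """logRF per unigram: log of its smoothed relative frequency in the
    target corpus over its smoothed relative frequency in the background.
    Positive scores mark words characteristic of the target corpus."""
    t, b = Counter(target_tokens), Counter(background_tokens)
    vocab = set(t) | set(b)
    nt = sum(t.values()) + k * len(vocab)
    nb = sum(b.values()) + k * len(vocab)
    return {w: math.log(((t[w] + k) / nt) / ((b[w] + k) / nb)) for w in vocab}

scores = log_rf_ratio(
    ["cures", "cancer", "cures"],          # toy illegal-ad corpus
    ["nice", "shirt", "soft", "shirt"],    # toy benign product descriptions
)
```

Words with high logRF (here, "cures") are exactly the kind of features, and later verb phrases, that the thesis mines as markers of illegal advertising.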
- Research Article
- 10.1145/3649506
- Apr 26, 2024
- ACM Transactions on Knowledge Discovery from Data
This article presents a comprehensive and practical guide for practitioners and end-users working with Large Language Models (LLMs) in their downstream Natural Language Processing (NLP) tasks. We provide discussions and insights into the usage of LLMs from the perspectives of models, data, and downstream tasks. First, we offer an introduction and brief summary of current language models. Then, we discuss the influence of pre-training data, training data, and test data. Most importantly, we provide a detailed discussion about the use and non-use cases of large language models for various natural language processing tasks, such as knowledge-intensive tasks, traditional natural language understanding tasks, generation tasks, emergent abilities, and considerations for specific tasks. We present various use cases and non-use cases to illustrate the practical applications and limitations of LLMs in real-world scenarios. We also try to understand the importance of data and the specific challenges associated with each NLP task. Furthermore, we explore the impact of spurious biases on LLMs and delve into other essential considerations, such as efficiency, cost, and latency, to ensure a comprehensive understanding of deploying LLMs in practice. This comprehensive guide aims to provide researchers and practitioners with valuable insights and best practices for working with LLMs, thereby enabling the successful implementation of these models in a wide range of NLP tasks. A curated list of practical guide resources of LLMs, regularly updated, can be found at https://github.com/Mooler0410/LLMsPracticalGuide. An LLMs evolutionary tree, editable yet regularly updated, can be found at llmtree.ai.
- Research Article
- 10.1038/s41598-025-14016-w
- Aug 5, 2025
- Scientific Reports
Speech is one of the most efficient methods of communication among humans, inspiring advancements in machine speech processing under Natural Language Processing (NLP). This field aims to enable computers to analyze, comprehend, and generate human language naturally. Speech processing, as a subset of artificial intelligence, is rapidly expanding due to its applications in emotion recognition, human-computer interaction, and sentiment analysis. This study introduces a novel algorithm for emotion recognition from speech using deep learning techniques. The proposed model achieves up to a 15% improvement compared to state-of-the-art deep learning methods in speech emotion recognition. It employs advanced supervised learning algorithms and deep neural network architectures, including Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. These models are trained on labeled datasets to accurately classify emotions such as happiness, sadness, anger, fear, surprise, and neutrality. The research highlights the system’s real-time application potential, such as analyzing audience emotional responses during live television broadcasts. By leveraging advancements in deep learning, the model achieves high accuracy in understanding and predicting emotional states, offering valuable insights into user behavior. This approach contributes to diverse domains, including media analysis, customer feedback systems, and human-machine interaction, showcasing the transformative potential of combining speech processing with neural networks.
- Research Article
- 10.26483/ijarcs.v16i2.7220
- Apr 30, 2025
- international journal of advanced research in computer science
Sentiment analysis is a critical task in Natural Language Processing (NLP) that determines sentiment polarity within textual data. Traditional sentiment analysis primarily focuses on binary classification. However, real-world reviews and social media content often exhibit multiple sentiments within a single sentence. This complexity necessitates Aspect-Based Sentiment Analysis (ABSA), which identifies aspect terms and their corresponding sentiments. Despite advancements, existing ABSA models struggle to capture interdependencies between aspect-opinion pairs, leading to misclassifications in multi-aspect scenarios. To address this, our study proposes an enhanced ABSA model that integrates dependency parsing with Large Language Model (LLM)-based learning to incorporate structured semantic knowledge for effective aspect-opinion relationship extraction. The integration of structured feature engineering and domain-specific vocabulary filtering in the proposed work ensures more precise sentiment classification. Experimental evaluations, based on average metrics computed from 5-fold cross-validation, demonstrate that the proposed model outperforms existing methods. The model achieves a 3.4% improvement in precision, a 4.9% increase in recall, and a 3.8% boost in F1-score. Additionally, it yields a 5.6% increase in Matthews Correlation Coefficient (MCC), reduces the False Discovery Rate by 3.3%, and lowers the Hamming Loss by 1.7%, ensuring enhanced consistency in multi-aspect sentiment classification. These findings underscore the value of integrating structured semantic knowledge into ABSA, which can significantly enhance the accuracy of sentiment analysis in practical applications.
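The Matthews Correlation Coefficient (MCC) reported in the evaluation above can be computed directly from the confusion counts. A self-contained binary-class sketch (the multi-label setting the paper evaluates would generalise this per label):

```python
import math

def mcc(y_true, y_pred, positive=1):
    """Matthews Correlation Coefficient for a binary task:
    +1 is perfect prediction, 0 is chance level, -1 is total disagreement."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Unlike accuracy, MCC stays informative under the class imbalance that multi-aspect sentiment data typically exhibits, which is presumably why the paper reports it alongside precision, recall, and F1.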
- Research Article
- 10.1145/3593583
- Aug 21, 2023
- ACM Transactions on Information Systems
Sentiment and emotion, which correspond to long-term and short-lived human feelings, are closely linked to each other, so sentiment analysis and emotion recognition are also two interdependent tasks in natural language processing (NLP). One task often leverages the shared knowledge from the other and performs better when solved in a joint learning paradigm. Conversational context dependency, multi-modal interaction, and multi-task correlation are three key factors that contribute to this joint paradigm. However, none of the recent approaches have considered them in a unified framework. To fill this gap, we propose a multi-modal, multi-task interactive graph attention network, termed M3GAT, to simultaneously solve the three problems. At the heart of the model is a proposed interactive conversation graph layer containing three core sub-modules: (1) a local-global context connection for modeling both local and global conversational context, (2) a cross-modal connection for learning multi-modal complementarity, and (3) a cross-task connection for capturing the correlation across the two tasks. Comprehensive experiments on three benchmark datasets, MELD, MEISD, and MSED, show the effectiveness of M3GAT over state-of-the-art baselines by margins of 1.88%, 5.37%, and 0.19% for sentiment analysis, and 1.99%, 3.65%, and 0.13% for emotion recognition, respectively. In addition, we also show the superiority of multi-task learning over the single-task framework.
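The graph attention mechanism at the core of M3GAT builds on the standard GAT update: each node aggregates its neighbours' projected features, weighted by learned attention scores. A single-head NumPy sketch of that generic building block, not the paper's full interactive conversation graph layer:

```python
import numpy as np

def graph_attention(h, adj, W, a, leak=0.2):
    """Single-head GAT-style layer.

    h:   (N, F) node features      adj: (N, N) adjacency (self-loops expected)
    W:   (F, F') projection        a:   (2F',) attention parameter vector
    """
    z = h @ W                                             # projected node features
    n = z.shape[0]
    e = np.full((n, n), -np.inf)                          # -inf masks non-edges
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                s = a @ np.concatenate([z[i], z[j]])      # raw attention logit
                e[i, j] = s if s > 0 else leak * s        # LeakyReLU
    alpha = np.exp(e - e.max(axis=1, keepdims=True))      # row-wise softmax
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha @ z                                      # attention-weighted aggregation

h = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = np.eye(2)
a = np.array([0.1, 0.2, 0.3, 0.4])
out = graph_attention(h, np.ones((3, 3)), W, a)
```

In M3GAT, edges of this kind would additionally span modalities and tasks, which is what the three connection sub-modules add on top of this basic attention step.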