Sentiment analysis for software engineering: How far can zero-shot learning (ZSL) go?


Similar Papers
  • Research Article
  • 10.2118/0125-0092-jpt
Zero-Shot Learning With Large Language Models Enhances Drilling-Information Retrieval
  • Jan 1, 2025
  • Journal of Petroleum Technology
  • Chris Carpenter

This article, written by JPT Technology Editor Chris Carpenter, contains highlights of paper SPE 217671, "Enhancing Information Retrieval in the Drilling Domain: Zero-Shot Learning With Large Language Models for Question Answering," by Felix J. Pacis, SPE, University of Stavanger, and Sergey Alyaev and Gilles Pelfrene, SPE, NORCE, et al. The paper has not been peer reviewed.

Finding information across multiple databases, formats, and documents remains a manual job in the drilling industry. Large language models (LLMs) have proven effective in data-aggregation tasks, including answering questions. However, using LLMs for domain-specific factual responses poses a nontrivial challenge: the expert-labor cost of training domain-specific LLMs prohibits niche industries from developing custom question-answering bots. The complete paper tests several commercial LLMs on information-retrieval tasks for drilling data using zero-shot in-context learning. In addition, the models' calibration is tested with a few-shot multiple-choice drilling questionnaire.

Introduction. While LLMs have proven effective in tasks ranging from sentiment analysis to text completion, using them for question answering presents the challenge of providing factual responses. Pretrained LLMs serve only as a parameterized implicit knowledge base and cannot access recent data; their information is bounded by the time of training. Retrieval-augmented generation (RAG) can address some of these issues by extending the utility of LLMs to specific data sources. Fig. 1 shows a simplified RAG-based LLM question/answer application. RAG involves two primary components: document retrieval (green boxes), which retrieves the most relevant context for the query, and LLM response generation (blue boxes). During response generation, the LLM operates on the prompt, query, and retrieved context without any change to the model parameters, a process the authors term "in-context learning."

Methodology. Two experiments were conducted: a few-shot multiple-choice experiment evaluated on the SLB drilling glossary, and a zero-shot in-context experiment evaluated on drilling reports and company reports.

Multiple-Choice Experiment (SLB Drilling Glossary). A publicly available drilling glossary served as the basis for evaluation. Of 409 term/definition pairs, five were chosen as few-shot default values, while the remaining 404 served as multiple-choice questions. Each question offered four choices, one of which was correct; the three incorrect choices were picked randomly from all possible terms minus the true answer.

Zero-Shot In-Context Experiment (Norwegian Petroleum Directorate Database). The authors explored the wellbore histories of individual exploration wells drilled on the Norwegian shelf in the NPD database. Twelve exploration wells were randomly chosen for evaluation, and stratigraphy information for three additional wells was added.

Annual Reports. Annual reports of two major operators in Norway for 2020 and 2021 were also considered. These short summaries present the main operational and economic results achieved by each company throughout the year, and were added to balance the higher technical content of the wellbore-history reports.
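The glossary-based multiple-choice construction described above (a few few-shot exemplars, then questions with random distractor terms) can be sketched in Python. The glossary entries here are illustrative placeholders, not the actual SLB terms:

```python
import random

def build_multiple_choice(glossary, n_few_shot=5, n_choices=4, seed=0):
    """Split term/definition pairs into few-shot exemplars and
    multiple-choice questions with randomly drawn distractor terms."""
    rng = random.Random(seed)
    pairs = list(glossary.items())
    rng.shuffle(pairs)
    few_shot, question_pairs = pairs[:n_few_shot], pairs[n_few_shot:]
    all_terms = [term for term, _ in pairs]
    questions = []
    for term, definition in question_pairs:
        # Distractors come from all terms minus the true answer.
        distractors = rng.sample([t for t in all_terms if t != term], n_choices - 1)
        choices = distractors + [term]
        rng.shuffle(choices)
        questions.append({"definition": definition, "choices": choices, "answer": term})
    return few_shot, questions

# Illustrative stand-in glossary (not the real 409 SLB entries).
glossary = {f"term_{i}": f"definition of term {i}" for i in range(9)}
few_shot, questions = build_multiple_choice(glossary, n_few_shot=2)
```

With the paper's numbers (409 pairs, 5 exemplars) this yields 404 four-choice questions.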

  • Research Article
  • Cited: 37
  • 10.1145/3461764
Sentiment Analysis Using XLM-R Transformer and Zero-shot Transfer Learning on Resource-poor Indian Language
  • Jun 30, 2021
  • ACM Transactions on Asian and Low-Resource Language Information Processing
  • Akshi Kumar + 1 more

Sentiment analysis on social media relies on comprehending natural language and on robust machine learning techniques that learn multiple layers of representations or features of the data and produce state-of-the-art prediction results. Cultural miscellanies, geographically limited trending topic hash-tags, access to native-language keyboards, and conversational comfort in the native language compound the linguistic challenges of sentiment analysis. This research evaluates the performance of cross-lingual contextual word embeddings and zero-shot transfer learning in projecting predictions from the resource-rich English language to the resource-poor Hindi language. The cross-lingual XLM-RoBERTa classification model is trained and fine-tuned on the English-language benchmark SemEval 2017 Task 4A dataset, and zero-shot transfer learning is then used to evaluate the model on two Hindi sentence-level sentiment analysis datasets, the IITP-Movie and IITP-Product review datasets. The proposed model compares favorably to state-of-the-art approaches, achieving an average accuracy of 60.93 across the two Hindi datasets, and offers an effective solution to sentence-level (tweet-level) sentiment analysis in a resource-poor scenario.

  • Research Article
  • 10.37256/ccds.6220256316
Challenges in Detecting Nuanced Sentiment with Advanced Models
  • Mar 11, 2025
  • Cloud Computing and Data Science
  • Edgar Ceh-Varela + 2 more

Sentiment analysis, an essential task in Natural Language Processing (NLP), determines the sentiment expressed in texts. This paper compares six sentiment analysis models, categorized into three groups by underlying technique: lexicon-based, machine learning-based, and zero-shot learning. The models are evaluated on four publicly available datasets (Movie Reviews, Amazon, Yelp, and Financial), each varying in complexity. The main objective is to assess the efficiency of these models in both binary (positive and negative) and ternary (positive, neutral, and negative) sentiment classification scenarios. Our results indicate that for binary classification, pre-trained large-scale state-of-the-art NLP models outperform other approaches across all evaluated metrics, achieving on average, across all datasets, 94% accuracy, 96% precision, 94% recall, and 94% F1-score. However, these pre-trained NLP models face significant challenges in three-class classification, where performance declines noticeably to an average of 60% accuracy, 66% precision, 60% recall, and 56% F1-score across datasets. This study highlights the limitations of current state-of-the-art models in handling subtler sentiment distinctions, and it emphasizes the need for further advances in sentiment analysis techniques to manage multi-class sentiment categorization that captures and interprets specialized jargon, technical terminology, and nuanced language.
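The binary-versus-ternary gap reported above shows up directly in macro-averaged metrics, where a weak neutral class drags the average down. A minimal sketch of macro precision/recall/F1 over toy predictions (not the paper's data):

```python
def macro_scores(y_true, y_pred, labels):
    """Macro-averaged precision, recall, and F1 over the given labels."""
    precs, recs, f1s = [], [], []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec); recs.append(rec); f1s.append(f1)
    n = len(labels)
    return sum(precs) / n, sum(recs) / n, sum(f1s) / n

# Toy ternary example: every neutral item is misclassified, so the
# neutral class contributes zeros that pull the macro averages down.
y_true = ["pos", "neg", "neu", "neu", "pos", "neg"]
y_pred = ["pos", "neg", "pos", "neg", "pos", "neg"]
p, r, f = macro_scores(y_true, y_pred, ["pos", "neg", "neu"])
```

Here the positive and negative classes each score 0.8 F1, yet the macro F1 is only about 0.53 because the neutral class scores zero.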

  • Research Article
  • Cited: 11
  • 10.54097/vfwgas09
Artificial Intelligence Methods in Natural Language Processing: A Comprehensive Review
  • Mar 13, 2024
  • Highlights in Science, Engineering and Technology
  • Yanhan Chen + 3 more

The rapid evolution of Artificial Intelligence (AI) since its inception in the mid-20th century has significantly influenced the field of Natural Language Processing (NLP), transforming it from a rule-based system to a dynamic and adaptive model capable of understanding the complexities of human language. This paper aims to offer a comprehensive review of the various applications and methodologies of AI in NLP, serving as a detailed guide for future research and practical applications. In the early sections, the paper elucidates the indispensable role of AI in NLP, highlighting its transition from symbolic reasoning to a focus on machine learning and deep learning, and its extensive applications in sectors such as healthcare, transportation, and finance. It emphasizes the symbiotic relationship between AI and NLP, facilitated by platforms like AllenNLP, which aid in the development of advanced language understanding models. Further, the paper explores specific AI techniques employed in NLP, including machine learning, Naive Bayes, and Support Vector Machines, and identifies pressing challenges and avenues for future research. It delves into the applications of AI in NLP, showcasing its transformative potential in tasks such as machine translation, facilitated by deep learning methods, and the development of chatbots and virtual assistants that have revolutionized human-technology interaction. The paper also highlights other fields impacted by AI techniques, including text summarization, sentiment analysis, and named entity recognition, emphasizing the efficiency and accuracy brought about by the integration of AI in these areas. In conclusion, the paper summarizes the remarkable advancements and persistent challenges in NLP, such as language ambiguity and contextual understanding, and underscores the need for diverse and representative labeled data for training. 
Looking forward, it identifies promising research avenues including Explainable AI, Few-shot and Zero-shot Learning, and the integration of NLP with other data modalities, aiming for a holistic understanding of multimodal data. The paper calls for enhanced robustness and security in NLP systems, especially in sensitive applications like content moderation and fake news detection, to foster trust and reliability in AI technologies. It advocates for continual learning in NLP models to adapt over time without losing previously acquired knowledge, paving the way for a future where AI and NLP work synergistically to understand and generate human language more effectively and efficiently.

  • Research Article
  • 10.35631/jistm.1039002
THE FUTURE OF BANGLA SENTIMENT ANALYSIS: ADVANCEMENTS, CHALLENGES, AND OPPORTUNITIES FOR PRACTICAL AND RESEARCH INNOVATION
  • Jun 5, 2025
  • Journal of Information System and Technology Management
  • Md Riaz Hasan + 3 more

Bangla sentiment analysis has advanced significantly, transitioning from rule-based models and lexicons to deep learning and transformer-based architectures. Despite these developments, the field still faces critical challenges, including limited labeled data, complex morphology, code-mixed language, and dialectal variation. Although recent models and datasets have improved accuracy, key issues remain such as narrow domain coverage, underexplored aspect-based and emotion classification, and potential ethical concerns related to bias and fairness. This paper critically examines current approaches, including deep neural and cross-lingual models, and highlights new frontiers like multimodal sentiment analysis and real-time inference. It also outlines strategic directions for future research, focusing on zero-shot learning, dialogue-based sentiment detection, and fairness-aware frameworks. The study aims to provide a roadmap for making Bangla sentiment analysis both technologically robust and socially responsible.

  • Conference Article
  • Cited: 2
  • 10.1109/sist54437.2022.9945811
Sentiment Analysis of Reviews in Kazakh With Transfer Learning Techniques
  • Apr 28, 2022
  • Aliya Nugumanova + 2 more

Heavily pretrained transformer models, such as Bidirectional Encoder Representations from Transformers (BERT) or Generative Pre-trained Transformer (GPT), have demonstrated a superior ability to recognize the sentiment of texts in English and other dominant languages. However, for low-resource languages such as Kazakh, no similar models exist, owing to the high computational and memory requirements of training them and the lack of labeled datasets. Under these circumstances, transfer learning can be applied to a low-resource language using a pretrained multilingual or related-language model. In this paper, we consider two ways to implement the transfer-learning strategy: zero-shot learning and fine-tuning. We design experiments to compare the two methods and report the obtained results. The experiments show that in both cases the BERT-based multilingual sentiment analysis model performs better than the BERT-based model for the Turkish language, and that the performance of both models grows after fine-tuning, even with a very small number of samples in Kazakh.

  • Research Article
  • Cited: 5
  • 10.1609/aaai.v35i18.17967
Zero-Shot Sentiment Analysis for Code-Mixed Data
  • May 18, 2021
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Siddharth Yadav + 1 more

Code-mixing is the practice of alternating between two or more languages. A major part of sentiment analysis research has been monolingual and they perform poorly on the code-mixed text. We introduce methods that use multilingual and cross-lingual embeddings to transfer knowledge from monolingual text to code-mixed text for code-mixed sentiment analysis. Our methods handle code-mixed text through zero-shot learning and beat state-of-the-art English-Spanish code-mixed sentiment analysis by an absolute 3% F1-score. We are able to achieve 0.58 F1-score (without a parallel corpus) and 0.62 F1-score (with the parallel corpus) on the same benchmark in a zero-shot way as compared to 0.68 F1-score in supervised settings. Our code is publicly available on github.com/sedflix/unsacmt.
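The transfer mechanism described above, classifying code-mixed text by proximity to label anchors in a shared cross-lingual embedding space, can be sketched with hand-made toy vectors; a real system would obtain the embeddings from a cross-lingual encoder such as MUSE or XLM-R rather than hard-coding them:

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot_sentiment(text_vec, label_vecs):
    """Pick the label whose anchor embedding is nearest (by cosine)
    to the text embedding; no labeled code-mixed data is needed."""
    return max(label_vecs, key=lambda lab: cosine(text_vec, label_vecs[lab]))

# Toy 3-d "shared space"; in practice these vectors come from a
# multilingual encoder applied to label descriptions or exemplars.
label_vecs = {"positive": [1.0, 0.1, 0.0], "negative": [-1.0, 0.1, 0.0]}
pred = zero_shot_sentiment([0.8, 0.3, 0.1], label_vecs)
```

Because the space is shared across languages, anchors estimated from monolingual English data can score code-mixed inputs directly.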

  • Research Article
  • 10.55041/isjem05164
A Modular Architecture for Scalable Multilingual Natural Language Processing
  • Nov 19, 2025
  • International Scientific Journal of Engineering and Management
  • Ravi Deo + 2 more

The exponential growth of global digital content has precipitated an urgent demand for Natural Language Processing (NLP) systems capable of operating seamlessly across a multitude of languages. However, the prevailing paradigm in NLP development remains predominantly monolingual, necessitating the construction of disparate, resource-intensive, and often inconsistent pipelines for individual languages. This fragmented approach is inherently inefficient, economically burdensome, and imposes severe scalability constraints, thereby exacerbating the "digital language divide." This paper proposes a novel, unified Modular Multilingual NLP (MM-NLP) architecture designed to streamline cross-lingual analysis through a cohesive, high-throughput workflow. The proposed system integrates a hierarchical automatic language detection mechanism powered by fastText, a dynamic routing layer for language-specific tokenization, and a massive shared multilingual transformer backbone (XLM-RoBERTa) utilizing cross-lingual transfer learning. By establishing a unified high-dimensional vector space for textual representations, the pipeline enables task-specific heads, such as sentiment analysis and named entity recognition, to be applied agnostically across diverse linguistic inputs. We present comprehensive experimental results demonstrating that our zero-shot transfer approach achieves 88% of the performance of fully supervised monolingual models while reducing computational overhead by 75% and deployment complexity by an order of magnitude. Furthermore, we introduce a standardized fairness evaluation module utilizing scikit-learn metrics to detect and mitigate cross-lingual performance disparities. Index Terms: Natural Language Processing, Multilingualism, Modular Architecture, Transformers, Zero-Shot Learning, Cross-lingual Transfer, Scalability, Computational Sustainability.

  • Research Article
  • 10.63503/j.ijcma.2025.94
Few-Shot Sentiment Adaptation: A MAML-Based Framework for Low-Resource NLP
  • May 8, 2025
  • International Journal on Computational Modelling Applications
  • Preeti Bala + 4 more

Sentiment analysis in low-resource languages has a tough obstacle to jump over because there just isn't enough labeled data. Traditional deep learning models tend to hit a wall in these situations since they need large datasets to really shine. To tackle this issue, we came up with Few-Shot Sentiment Adaptation (FSSA), a meta-learning framework based on MAML. This approach lets us classify sentiment with just a few labeled examples. By training on languages that have lots of resources and then adapting to those that don't, we can quickly pick up on sentiment patterns using a 5-way, 5-shot method. We tested FSSA on public low-resource sentiment datasets and compared it with fine-tuned BERT models, zero-shot learning, and other few-shot classification techniques. Our results showed a major improvement over existing methods, proving adaptable even when dealing with limited data. Unlike the usual transfer-learning methods, FSSA allows for quick tweaks without needing extensive fine-tuning, making it ideal for real-world low-resource Natural Language Processing (NLP) applications. This research helps bridge the gap between high- and low-resource languages in the NLP field, reducing the reliance on large annotated datasets. We've made few-shot meta-learning for sentiment analysis easier to understand, paving the way for more robust and efficient language models.
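The 5-way, 5-shot setup mentioned above amounts to sampling episodes with disjoint support and query sets per class; a minimal episode-sampler sketch, with placeholder data standing in for the paper's datasets:

```python
import random

def sample_episode(dataset, n_way=5, k_shot=5, k_query=5, seed=0):
    """Sample one N-way K-shot episode: a support set for inner-loop
    adaptation and a query set for the meta-update, as in MAML."""
    rng = random.Random(seed)
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for label in classes:
        # Draw support and query examples without overlap within a class.
        examples = rng.sample(dataset[label], k_shot + k_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query

# Illustrative data: sentiment-labeled sentences keyed by class name.
data = {f"class_{i}": [f"sent_{i}_{j}" for j in range(20)] for i in range(8)}
support, query = sample_episode(data, n_way=5, k_shot=5, k_query=5)
```

Each episode then drives one MAML step: adapt on the 25 support items, evaluate the adapted model on the 25 query items.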

  • Research Article
  • Cited: 21
  • 10.1016/j.eswa.2022.118119
Aspect2Labels: A novelistic decision support system for higher educational institutions by using multi-layer topic modelling approach
  • Jul 19, 2022
  • Expert Systems with Applications
  • Shabir Hussain + 9 more


  • Research Article
  • 10.1007/s13278-025-01434-x
Multilingual user perceptions analysis from Twitter using zero-shot learning for border control technologies
  • Mar 18, 2025
  • Social Network Analysis and Mining
  • Sarang Shaikh + 3 more

Online social networks such as Twitter, Facebook, Instagram, and Reddit have transformed communications by enabling users to share their opinions and perceptions. The vast amount of user-generated content on these platforms poses significant challenges for manual analysis. Advances in artificial intelligence, particularly transformer-based models such as BERT and GPT, have improved the processing of multilingual data for tasks such as text classification, sentiment analysis, and emotion analysis. However, these models often require extensive task-specific training and high-quality labeled data, making them impractical for multilingual contexts. This study addresses these limitations by leveraging zero-shot learning with transformer-based models, which eliminates the need for task-specific training and can classify new data into unseen classes without manual annotation. The use case for this study is border control technologies (BCTs), a hot topic following the European Union commission's "Smart Borders Package" aimed at improving the efficiency and security of border crossing points. The major contribution of this study lies in introducing a novel framework to explore multilingual user perceptions, focusing on BCTs, using an innovative "user perception extraction architecture" for analyzing multilingual perceptions from Twitter. This architecture enables modular, scalable, and domain-independent analysis that is adaptable to various emerging technologies and domains beyond BCTs. Furthermore, this study compiles a unique dataset of 90,789 multilingual tweets related to BCTs from 2008 to 2022, providing valuable insights into public perceptions of BCTs. The findings reveal dynamic trends in user perceptions influenced by geopolitical events and policy changes, offering actionable insights for policymakers, researchers, and developers.
By contextualizing these findings, this study equips stakeholders with new knowledge to bridge the gap between public concerns and adoption of BCTs.
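Zero-shot classification of the kind this study relies on is commonly framed as natural language inference: each candidate label becomes a hypothesis such as "This text is about <label>." The sketch below uses a token-overlap stand-in for the entailment scorer; a real system would use an NLI-fine-tuned transformer, and the labels shown are illustrative assumptions, not the study's taxonomy:

```python
def zero_shot_classify(text, candidate_labels, entail_score):
    """NLI-style zero-shot classification: score the hypothesis
    'This text is about <label>.' for each candidate label and
    return the labels ranked by entailment score."""
    scored = [(lab, entail_score(text, f"This text is about {lab}."))
              for lab in candidate_labels]
    return sorted(scored, key=lambda pair: -pair[1])

def overlap_score(premise, hypothesis):
    """Stand-in entailment scorer: fraction of hypothesis tokens that
    appear in the premise. A real scorer would be an NLI model."""
    p = set(premise.lower().split())
    h = set(hypothesis.lower().split())
    return len(p & h) / len(h)

ranked = zero_shot_classify(
    "New biometric gates speed up border checks at the airport",
    ["border control", "sports", "cooking"],
    overlap_score,
)
```

Because the label set lives only in the hypothesis text, new classes can be added without any retraining, which is what makes the approach attractive for multilingual, fast-moving topics.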

  • Conference Article
  • Cited: 140
  • 10.1145/3551349.3559555
Few-shot training LLMs for project-specific code-summarization
  • Oct 10, 2022
  • Toufique Ahmed + 1 more

Very large language models (LLMs), such as GPT-3 and Codex, have achieved state-of-the-art performance on several natural-language tasks and also show great promise for code. A particularly exciting aspect of LLMs is their knack for few-shot and zero-shot learning: they can learn to perform a task with very few examples. Few-shotting has particular synergies in software engineering, where many phenomena (identifier names, APIs, terminology, coding patterns) are known to be highly project-specific. However, project-specific data can be quite limited, especially early in a project's history; thus the few-shot learning capacity of LLMs might be very relevant. In this paper, we investigate the use of few-shot training with the very large GPT (Generative Pre-trained Transformer) Codex model, and find evidence suggesting that one can significantly surpass state-of-the-art models for code summarization by leveraging project-specific training.

  • Conference Article
  • Cited: 9
  • 10.1109/siu53274.2021.9477890
Sentiment Analysis of Customer Comments in Banking using BERT-based Approaches
  • Jun 9, 2021
  • Melik Masarifoglu + 6 more

Customer comments collected by companies through various channels are useful resources for understanding customer satisfaction. The continuous increase in the volume of comments makes manual analysis infeasible. In this study, customers' comments on banking services, written in Turkish and collected through NPS questionnaires, were analyzed using Natural Language Processing methods. BERT-based sentiment classification models were developed and compared with traditional methods for the banking domain. The effectiveness of the methods was investigated in a low-resource setting, where (i) there is a small amount of labeled training data and (ii) there is no labeled training data in the target domain. For the first case, the results showed that the BERTurk-based model performs better than the traditional models and that its performance is less affected by a decrease in training-data size. For the second case, training with out-of-domain data from Twitter was explored. In addition, zero-shot learning with XLM-RoBERTa, which was pretrained for natural language inference, was investigated. While using out-of-domain data resulted in poor performance, the zero-shot learning approach achieved promising results for sentiment classification in the banking domain.

  • Research Article
  • Cited: 7
  • 10.3390/info15080499
Exploring Tourist Experience through Online Reviews Using Aspect-Based Sentiment Analysis with Zero-Shot Learning for Hospitality Service Enhancement
  • Aug 20, 2024
  • Information
  • Ibrahim Nawawi + 3 more

Hospitality services play a crucial role in shaping tourist satisfaction and revisiting intention toward destinations. Traditional feedback methods like surveys often fail to capture the nuanced and real-time experiences of tourists. Digital platforms such as TripAdvisor, Yelp, and Google Reviews provide a rich source of user-generated content, but the sheer volume of reviews makes manual analysis impractical. This study proposes integrating aspect-based sentiment analysis with zero-shot learning to analyze online tourist reviews effectively without requiring extensive annotated datasets. Using pretrained models like RoBERTa, the research framework involves keyword extraction, sentence segment detection, aspect construction, and sentiment polarity measurement. The dataset, sourced from TripAdvisor reviews of attractions, hotels, and restaurants in Central Java, Indonesia, underwent preprocessing to ensure suitability for analysis. The results highlight the importance of aspects such as food, accommodation, and cultural experiences in tourist satisfaction. The findings indicate a need for continuous service improvement to meet evolving tourist expectations, demonstrating the potential of advanced natural language processing techniques in enhancing hospitality services and customer satisfaction.
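The four-stage pipeline described above (keyword extraction, segment detection, aspect construction, polarity measurement) can be sketched with a lexicon stand-in; the aspect keywords and polarity lexicon below are invented for illustration, whereas the paper itself uses RoBERTa-based zero-shot models for aspect matching and polarity:

```python
# Hypothetical aspect keywords and polarity lexicon for illustration only.
ASPECT_KEYWORDS = {"food": {"food", "meal", "menu"},
                   "accommodation": {"room", "hotel", "bed"}}
POLARITY_LEXICON = {"delicious": 1, "great": 1, "clean": 1,
                    "noisy": -1, "bland": -1}

def aspect_sentiments(review):
    """Split a review into clause-like segments, attach each segment to
    any matched aspect, and accumulate a lexicon-based polarity score."""
    results = {}
    for segment in review.lower().replace(",", ".").split("."):
        tokens = set(segment.split())
        score = sum(POLARITY_LEXICON.get(t, 0) for t in tokens)
        for aspect, keys in ASPECT_KEYWORDS.items():
            if tokens & keys:
                results[aspect] = results.get(aspect, 0) + score
    return results

res = aspect_sentiments("The food was delicious. The room was noisy")
```

Swapping the keyword sets for zero-shot aspect classification and the lexicon for a transformer polarity model recovers the paper's design without changing the pipeline's shape.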

  • Research Article
  • Cited: 2
  • 10.1002/pra2.778
Voices of the Stacks: A Multifaceted Inquiry into Academic Librarians' Tweets
  • Oct 1, 2023
  • Proceedings of the Association for Information Science and Technology
  • Souvick Ghosh + 1 more

Twitter has emerged as an important forum for discussion among academic librarians. In this research, we take a mixed-methods approach to study the thematic content and sentiment of tweets authored by academic librarians in the United States, Canada, and the United Kingdom. We found differences in the semantic content and themes present in the data from each country that point to differences in how librarians in each country engage on Twitter. While more work remains to be done, we cast new light on how members of professional communities use social media. Our qualitative analysis identified 11 thematic categories in academic librarians' Twitter discussions, focusing on professional topics. UK librarians exhibited a higher frequency of labor- and employment-related terms compared to their US and Canadian counterparts. Sentiment ratios for US and Canadian tweets were similar, while the UK displayed nearly double the positive-to-negative tweet ratio. We also present a methodological intervention comparing two sentiment analysis methods, VADER and Zero-Shot Learning (ZSL), for classifying posts by academic librarians. ZSL significantly outperformed the off-the-shelf classifier, highlighting how accurate prediction is possible without annotated training data.
