Human Annotators Research Articles

Background: Mobile app usage is increasing in the digital age, with ride-hailing apps being a primary example of this trend. To understand how people perceive and interact with mobile apps, user reviews on platforms such as Google Play are usually analyzed. This analysis can help developers identify areas for improvement in both Ride-Hailing and Google Play apps. A promising method for analyzing user perception in this setting is Aspect-Based Sentiment Analysis (ABSA).

Objective: This research aimed to apply ABSA to user reviews using Bidirectional Encoder Representations from Transformers (BERT) models. Aspect identification relied on topic modeling with Latent Dirichlet Allocation (LDA): LDA extracted topics from the reviews, and Generative Artificial Intelligence (GenAI) was used to define the aspects of those topics to further enhance the analysis. For consistency and accuracy, the method included sentiment annotation by a human annotator.

Methods: Two datasets were used in this research. The first was collected by scraping Ride-Hailing App user reviews, and the second was obtained from Kaggle. Relevant topics were identified with LDA and then categorized into aspects using GenAI, covering areas such as customer experience, service, payment, app features, task management, and event management. Subsequently, sentiment labeling was conducted by human annotators to provide a reliable baseline. A BERT model was then used to classify sentiment with aspect hints, and the evaluation included accuracy, precision, recall, and F1-score.

Results: The results showed that the BERT model achieved the highest accuracy of 97% in sentiment analysis across all datasets.

Conclusion: This research provided valuable insight into user experience and established a strong ABSA framework for analyzing user reviews using LDA, aspect annotation, GenAI, and BERT sentiment models. Future research could extend this method to other app categories and incorporate real-time ABSA for continuous monitoring and dynamic feedback.

Keywords: User Reviews, Aspect-Based Sentiment Analysis (ABSA), Sentiment Analysis, Topic Modeling, Generative Artificial Intelligence (GenAI)
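The abstract above names accuracy, precision, recall, and F1-score as its evaluation measures but does not say how they were averaged across sentiment classes. The following is a minimal sketch, assuming macro averaging over classes; the function name `classification_metrics` and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the abstract does not state
    which averaging scheme the authors used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical human-annotated labels and model predictions.
    gold = ["pos", "pos", "neg", "neg"]
    predicted = ["pos", "neg", "neg", "neg"]
    print(classification_metrics(gold, predicted))
```

With imbalanced sentiment classes, macro averaging weights every class equally, so it can differ noticeably from accuracy; a weighted average would be an equally plausible reading of the abstract.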

For software evolution, user feedback has become a meaningful way to improve applications. Recent studies show a significant increase in the analysis of end-user feedback from various social media platforms for software evolution. However, less attention has been given to end-user feedback for low-rating software applications. Moreover, such approaches rely mainly on the judgment of human annotators, who might have subconsciously second-guessed their labels, which calls the validity of the methods into question. To address this, we proposed an approach that analyzes end-user feedback for low-rating applications to identify the end-user opinion types associated with negative reviews (an issue or bug). We also used Generative Artificial Intelligence (AI), i.e., ChatGPT, as an annotator and negotiator when preparing a truth set for the deep learning (DL) classifiers that identify end-user emotion. First, we used the ChatGPT Application Programming Interface (API) to identify negative end-user feedback by processing 71,853 reviews collected from 45 apps in the Amazon store. Next, a novel grounded theory was developed by manually processing the negative feedback to identify frequently associated emotion types, including anger, confusion, disgust, distrust, disappointment, fear, frustration, and sadness. Two datasets were then developed with the identified emotion types, one by human annotators using a content analysis approach and the other using the ChatGPT API. Another round was then conducted with ChatGPT to negotiate over the conflicts with the human-annotated dataset, resulting in a conflict-free emotion detection dataset. Finally, various DL classifiers, including LSTM, BiLSTM, CNN, RNN, GRU, BiGRU, and BiRNN, were evaluated for their efficacy in detecting end-user emotions by preprocessing the input data, applying feature engineering, balancing the dataset, and then training and testing the classifiers using a cross-validation approach.
We obtained average accuracies of 94%, 94%, 93%, 92%, 91%, 91%, and 85% with LSTM, BiLSTM, RNN, CNN, GRU, BiGRU, and BiRNN, respectively, showing improved results with the truth set curated by humans and ChatGPT. Using ChatGPT as an annotator and negotiator can help automate and validate the annotation process, resulting in better DL performance.
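The abstract describes comparing a human-annotated dataset with a ChatGPT-annotated one and then negotiating over the conflicts. It does not say how conflicts were located or whether agreement was quantified. The following is a minimal sketch, assuming both label sets are aligned lists of emotion labels and using Cohen's kappa as one common agreement measure; the function names and sample labels are hypothetical, and no ChatGPT API call is shown.

```python
from collections import Counter


def find_conflicts(human_labels, model_labels):
    """Return the indices where the two annotators disagree."""
    if len(human_labels) != len(model_labels):
        raise ValueError("label lists must have the same length")
    return [
        i for i, (h, m) in enumerate(zip(human_labels, model_labels)) if h != m
    ]


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if not labels_a or len(labels_a) != len(labels_b):
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label
    # independently, given each annotator's own label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels for four reviews.
    human = ["anger", "fear", "anger", "sadness"]
    model = ["anger", "fear", "sadness", "sadness"]
    print("conflicting review indices:", find_conflicts(human, model))
    print("Cohen's kappa:", round(cohen_kappa(human, model), 3))
```

The conflicting indices are the reviews that would go into a further negotiation round; how that round is run (prompts, resolution rules) is not specified in the abstract and is left out here.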

Articles published on Human Annotators

Influence Reasoning Capabilities of Large Language Models in Social Environments

Accurate Acupoint Localization in 2D Hand Images: Evaluating HRNet and ResNet Architectures for Enhanced Detection Performance

MultiADE: A Multi-domain benchmark for Adverse Drug Event extraction

Enhancing scene text detectors with realistic text image synthesis using diffusion models

A novel explainable machine learning-based healthy ageing scale

Unveiling User Sentiment: Aspect-Based Analysis and Topic Modeling of Ride-Hailing and Google Play App Reviews

Tobacco control policies discussed on social media: a scoping review

Automated tree crown labeling with 3D radiative transfer modelling achieves human comparable performances for tree segmentation in semi-arid landscapes

Skin feature point tracking using deep feature encodings

Estimating Contribution Quality in Online Deliberations Using a Large Language Model

Utility-Oriented Knowledge Graph Accuracy Estimation with Limited Annotations: A Case Study on DBpedia

A natural language processing approach to detect inconsistencies in death investigation notes attributing suicide circumstances

Leveraging Large Language Model ChatGPT for enhanced understanding of end-user emotions in social media feedbacks

Fast Context-Aware Analysis of Genome Annotation Colocalization.

Augmenting biomedical named entity recognition with general-domain resources

Variation in forest root image annotation by experts, novices, and AI

Multi-level discriminator based contrastive learning for multiplex networks

Algorithmic assessment of drag on thermally cut sheet metal edges

Automated Classification of Elementary Instructional Activities

Evidence and Axial Attention Guided Document-level Relation Extraction
