Sentiment Corpus Research Articles

In this paper, we propose a semi-supervised approach for sentiment analysis of Arabic and its dialects. The approach is based on a sentiment corpus that was constructed automatically and then reviewed manually by native speakers of the Algerian dialect. It consists of constructing and applying a set of deep learning algorithms to classify the sentiment of Arabic messages as positive or negative. It was applied to Facebook messages written in Modern Standard Arabic (MSA) as well as in the Algerian dialect (DALG, a low-resource dialect spoken by more than 40 million people), in both Arabic and Arabizi scripts. To handle Arabizi, we consider two options: transliteration (widely used in the research literature for handling Arabizi) and translation (never used in the research literature for handling Arabizi). To highlight the effectiveness of the semi-supervised approach, we carried out experiments using both corpora for training (i.e., the corpus constructed automatically and the one that was reviewed manually). The experiments were run on several test corpora dedicated to MSA/DALG that were proposed and evaluated in the research literature. Both shallow and deep learning classifiers are used, such as Random Forest (RF), Logistic Regression (LR), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). These classifiers are combined with word embedding models such as Word2vec and fastText. Experimental results (F1 score up to 95% for intrinsic experiments and up to 89% for extrinsic experiments) show that the proposed system outperforms existing state-of-the-art methods (the best improvement is up to 25%).
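As a rough illustration of the metric reported above, the following sketch computes an F1 score for the positive class in a binary positive/negative setting. The abstract does not say whether the reported figures are per-class or averaged, so the single-class definition, the function name `f1_score`, and the toy labels are all assumptions made for illustration.

```python
def f1_score(y_true, y_pred, positive="positive"):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for this example; they are not from the paper's corpora.
gold = ["positive", "positive", "negative", "negative", "positive"]
pred = ["positive", "negative", "negative", "positive", "positive"]

score = f1_score(gold, pred)
print(f"F1 (positive class): {score:.3f}")
```

With these toy labels there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1 score is also 2/3.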

In this paper, we propose Stacked DeBERT, short for Stacked Denoising Bidirectional Encoder Representations from Transformers. Compared with existing systems, this novel model improves robustness to incomplete data by designing a novel encoding scheme in BERT, a powerful language representation model based solely on attention mechanisms. Incomplete data in natural language processing refers to text with missing or incorrect words, and its presence can hinder the performance of current models that were not implemented to withstand such noise but must still perform well under duress. This is because current approaches are built for and trained with clean and complete data, and thus cannot extract features that adequately represent incomplete data. Our proposed approach consists of obtaining intermediate input representations by applying an embedding layer to the input tokens, followed by vanilla transformers. These intermediate features are given as input to novel denoising transformers, which are responsible for obtaining richer input representations. The proposed approach takes advantage of stacks of multilayer perceptrons, which reconstruct the embeddings of missing words by extracting more abstract and meaningful hidden feature vectors, and of bidirectional transformers for improved embedding representation. We consider two datasets for training and evaluation: the Chatbot Natural Language Understanding Evaluation Corpus and Kaggle’s Twitter Sentiment Corpus. Our model shows improved F1-scores and better robustness on the informal/incorrect text present in tweets and on text with Speech-to-Text errors in the sentiment and intent classification tasks. Code: https://github.com/gcunhase/StackedDeBERT
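The reconstruction step described above can be pictured with a minimal sketch: a stack of multilayer perceptrons encodes each intermediate feature vector into a smaller hidden vector and decodes it back to the original size. This is not the released Stacked DeBERT code. The bottleneck layout, the layer sizes, the random untrained weights, and every name below (`init_layer`, `mlp_stack`, `encoder`, `decoder`) are assumptions chosen only to show the data flow.

```python
import math
import random


def linear(x, w, b):
    """Apply y = W x + b to a single vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero bias; a trained model would learn these."""
    scale = 1.0 / math.sqrt(n_in)
    w = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def mlp_stack(x, layers):
    """Run a vector through a stack of layers, with ReLU between layers."""
    for i, (w, b) in enumerate(layers):
        x = linear(x, w, b)
        if i < len(layers) - 1:
            x = relu(x)
    return x


rng = random.Random(0)
dim, hidden, bottleneck = 8, 4, 2  # hypothetical sizes, not from the paper

# Encoder compresses to a small hidden vector; decoder maps back to `dim`.
encoder = [init_layer(dim, hidden, rng), init_layer(hidden, bottleneck, rng)]
decoder = [init_layer(bottleneck, hidden, rng), init_layer(hidden, dim, rng)]

# Random stand-in for the intermediate features: one vector per input token.
features = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]

reconstructed = [mlp_stack(mlp_stack(h, encoder), decoder) for h in features]
print(len(reconstructed), len(reconstructed[0]))
```

In a trained model the weights would be fitted so that features computed from incomplete text are mapped close to those of the corresponding complete text. Here the weights are random, so only the shapes are meaningful.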

Articles published on Sentiment Corpus

Feature Extraction Network with Attention Mechanism for Data Enhancement and Recombination Fusion for Multimodal Sentiment Analysis

Modeling Public Sentiments About JUUL Flavors on Twitter Through Machine Learning

Monotone submodular subset for sentiment analysis of online reviews

AraSenCorpus: A Semi-Supervised Approach for Sentiment Annotation of a Large Arabic Text Corpus

A Semi-supervised Approach for Sentiment Analysis of Arab(ic+izi) Messages: Application to the Algerian Dialect

Experiments in Text Classification: Analyzing the Sentiment of Electronic Product Reviews in Greek

Automatic Indonesian Sentiment Lexicon Curation with Sentiment Valence Tuning for Social Media Sentiment Analysis

Stacked DeBERT: All attention in incomplete data for text classification

ArAutoSenti: automatic annotation and new tendencies for sentiment classification of Arabic messages

Arabic dialect sentiment analysis with ZERO effort. Case study: Algerian dialect

Corpus Development for Malay Sentiment Analysis Using Semi Supervised Approach

Cross-lingual deep neural transfer learning in sentiment analysis

A mixed approach of statistical weighting method and unsupervised method to improve Uyghur sentiment classification

Arabic sentiment analysis: studies, resources, and tools

A Check on Annotation in Sentiment Research

Sentiment Classification of Reviews Based on BiGRU Neural Network and Fine-grained Attention

Attention-based long short-term memory network using sentiment lexicon embedding for aspect-level sentiment analysis in Korean

Sentiment analysis through recurrent variants latterly on convolutional neural network of Twitter

Mapping Consumer Sentiment Toward Wireless Services Using Geospatial Twitter Data

Identification of fact-implied implicit sentiment based on multi-level semantic fused representation
