In recent years, sentiment analysis has become a focal point in natural language processing. Cross-lingual sentiment analysis is a particularly demanding yet essential task that seeks to build models capable of analyzing sentiment effectively across many languages. The primary motivation behind this research is to bridge the gap in current techniques, which often perform poorly on low-resource languages because of the scarcity of large annotated datasets and these languages' unique linguistic characteristics. In light of these challenges, we propose a novel Multi-Stage Deep Learning Architecture (MSDLA) for cross-lingual sentiment analysis of Tamil, a low-resource language. Our approach uses transfer learning from a resource-rich source language to overcome data limitations. The proposed model significantly outperforms existing methods on the Tamil Movie Review dataset, achieving an accuracy, precision, recall, and F1-score of 0.8772, 0.8614, 0.8825, and 0.8718, respectively. ANOVA-based statistical comparison shows that MSDLA's improvements over other models, including mT5, XLM, mBERT, ULMFiT, BiLSTM, LSTM with Attention, and ALBERT with Hugging Face English Embedding, are significant, with all p-values below 0.005. Ablation studies confirm the importance of both cross-lingual semantic attention and domain adaptation in our architecture: removing these components reduces accuracy to 0.8342 and 0.8043, respectively. Furthermore, MSDLA demonstrates robust cross-domain performance on the Tamil News Classification and Thirukkural datasets, achieving accuracies of 0.8551 and 0.8624, respectively, significantly outperforming the baseline models. These findings illustrate the robustness and efficacy of our approach and constitute a significant contribution to cross-lingual sentiment analysis, especially for low-resource languages.
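The abstract does not include implementation details, so the following is only a minimal sketch of the general transfer-learning setup it describes: fine-tuning a multilingual encoder on a high-resource source language (English) and then adapting it to the low-resource target language (Tamil). The choice of backbone, hyperparameters, and the placeholder data are assumptions for illustration, not the authors' MSDLA code.

```python
# Hypothetical two-stage cross-lingual transfer sketch (not the authors' MSDLA):
# Stage 1 fine-tunes a multilingual encoder on English sentiment data,
# Stage 2 continues fine-tuning on Tamil data so the shared representation adapts.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-multilingual-cased"  # assumed multilingual backbone
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

def encode(texts, labels):
    """Tokenize raw texts and pair them with binary sentiment labels."""
    enc = tokenizer(texts, padding=True, truncation=True, max_length=128, return_tensors="pt")
    return list(zip(enc["input_ids"], enc["attention_mask"], torch.tensor(labels)))

def fine_tune(model, data, epochs=2, lr=2e-5):
    """Plain fine-tuning loop shared by the source- and target-language stages."""
    loader = DataLoader(data, batch_size=16, shuffle=True)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for input_ids, attention_mask, labels in loader:
            out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
            out.loss.backward()
            optimizer.step()
            optimizer.zero_grad()

# Placeholder examples; real experiments would use the full annotated corpora.
english_data = encode(["great movie", "terrible plot"], [1, 0])
tamil_data = encode(["நல்ல படம்", "மோசமான படம்"], [1, 0])

fine_tune(model, english_data)  # Stage 1: learn sentiment from the high-resource language
fine_tune(model, tamil_data)    # Stage 2: adapt to the low-resource target language
```

In this simplified form, the components the abstract highlights (cross-lingual semantic attention and domain adaptation) are not modeled; the sketch only illustrates the resource-rich-to-Tamil transfer stage on which the ablation results are reported.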