Abstract

Existing sentiment classifiers are usually built for a single specific language, and different languages require different classification models. In this paper we aim to build a universal sentiment classifier that uses a single classification model across multiple languages. To achieve this goal, we propose to jointly learn multilingual sentiment-aware word embeddings based only on labeled reviews in English and unlabeled parallel data available for a few language pairs. Parallel data is not required between English and every other language, because the sentiment information can be transferred into any language via pivot languages. We present evaluation results of our universal sentiment classifier in five languages, and the results are very promising even when no parallel data between English and the target languages is used. Furthermore, we compare the universal single classifier with several cross-language sentiment classifiers that rely on direct parallel data between the source and target languages, and the results show that the performance of our universal sentiment classifier is competitive with that of these cross-language classifiers across multiple target languages.
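As a rough illustration of the idea described above (not the paper's actual architecture), the sketch below trains one shared classifier over per-language embedding tables that all map into a single space: a sentiment loss on labeled English reviews makes the embeddings sentiment-aware, while a simple alignment loss on unlabeled parallel sentence pairs pulls the languages together. The class names, the averaging sentence encoder, and the squared-distance alignment term are assumptions made for illustration only.

```python
# Minimal sketch (not the paper's exact model): a single sentiment classifier
# over a shared multilingual embedding space. Embeddings become
# "sentiment-aware" through a classification loss on labeled English reviews
# and are aligned across languages by an L2 loss on parallel sentence pairs.
import torch
import torch.nn as nn


class SharedSpaceClassifier(nn.Module):
    def __init__(self, vocab_sizes, dim=128, n_classes=2):
        super().__init__()
        # One embedding table per language, all mapping into the same space.
        self.embeds = nn.ModuleDict(
            {lang: nn.Embedding(size, dim) for lang, size in vocab_sizes.items()}
        )
        # A single classifier shared by every language.
        self.clf = nn.Linear(dim, n_classes)

    def encode(self, lang, token_ids):
        # Average word embeddings as a very simple sentence representation.
        return self.embeds[lang](token_ids).mean(dim=1)

    def forward(self, lang, token_ids):
        return self.clf(self.encode(lang, token_ids))


def training_losses(model, en_reviews, en_labels, parallel_batches):
    """Sentiment loss on labeled English data + alignment loss on parallel data.

    parallel_batches: list of (lang_a, ids_a, lang_b, ids_b) sentence pairs.
    """
    sent_loss = nn.functional.cross_entropy(model("en", en_reviews), en_labels)
    align_loss = sum(
        (model.encode(la, ia) - model.encode(lb, ib)).pow(2).sum(dim=1).mean()
        for la, ia, lb, ib in parallel_batches
    )
    return sent_loss + align_loss
```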

Highlights

  • Nowadays, a large amount of user-generated content (UGC) appears online every day, such as tweets, comments and product reviews

  • The Bilingual Model (BM) relies on the direct parallel data between the source and target languages, and it generally works slightly better than the other models, including the PMDB model and the UMM model

  • The results demonstrate that the pivot-driven model is very effective for learning bilingual/trilingual sentiment-aware word embeddings (a rough illustration of the pivot idea follows this list)
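Reusing the hypothetical `SharedSpaceClassifier` and `training_losses` names from the sketch above, the snippet below illustrates how pivot-based transfer could look in that simplified setting: French is connected to English only through German parallel data, yet the English-trained classifier can then score French reviews directly. The choice of languages, tensor shapes, and random data are placeholders, not the paper's experimental setup.

```python
import torch

# Hypothetical pivot setup (illustrative only): English-German and
# German-French parallel sentences exist, but no English-French pairs.
model = SharedSpaceClassifier({"en": 5000, "de": 5000, "fr": 5000})
en_reviews = torch.randint(0, 5000, (8, 20))   # labeled English reviews
en_labels = torch.randint(0, 2, (8,))          # their sentiment labels
parallel_batches = [
    ("en", torch.randint(0, 5000, (8, 20)), "de", torch.randint(0, 5000, (8, 20))),
    ("de", torch.randint(0, 5000, (8, 20)), "fr", torch.randint(0, 5000, (8, 20))),
]

loss = training_losses(model, en_reviews, en_labels, parallel_batches)
loss.backward()  # one illustrative training step (optimizer omitted)

# At test time the English-trained classifier scores French reviews directly,
# because French embeddings now share the space via the German pivot.
french_reviews = torch.randint(0, 5000, (4, 20))
predictions = model("fr", french_reviews).argmax(dim=1)
```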


Summary

Introduction

A large amount of user-generated content (UGC) appears online every day, such as tweets, comments and product reviews. Sentiment classification on these data has become a popular research topic over the past few years (Pang et al., 2002; Blitzer et al., 2007; Agarwal et al., 2011; Liu, 2012). Most existing sentiment classifiers rely on labeled training data, and the data are usually language-dependent. Labeled training data for sentiment classification are not available, or not easy to obtain, in many languages of the world (e.g., Malaysian, Mongolian, Uighur). It is hard to build a sentiment classifier in these resource-poor languages.


