Abstract

Previous research on sentiment analysis has mainly focused on binary or ternary classification of monolingual text. However, on today's social media, such as micro-blogs, emotions are often expressed in bilingual or multilingual text, known as code-switching text, and people's emotions are complex, including happiness, sadness, anger, fear, and surprise, among others. Different emotions may coexist, and the distribution of emotions in code-switching text is often unbalanced. Inspired by the recently proposed BERT model, we investigate how to fine-tune BERT for multi-label sentiment analysis in code-switching text. Our investigation covers the selection of pre-trained models and the fine-tuning methods for BERT on this task. To address the unbalanced distribution of emotions, we propose a method based on data augmentation, undersampling, and ensemble learning that produces balanced samples and trains different multi-label BERT classifiers. Our model combines the predictions of the classifiers to produce the final output. Experiments on the dataset of NLPCC 2018 shared task 1 show the effectiveness of our model on unbalanced code-switching text. The F1-score of our model is higher than that of many previous models.
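The undersampling and ensemble steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the example format, the function names (`balanced_subset`, `make_subsets`, `ensemble_predict`), the rule of drawing as many examples per label as the rarest label has, and the choice to average probabilities and apply a 0.5 threshold are all assumptions made for illustration. Data augmentation and the BERT classifiers themselves are not shown.

```python
import random
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical example format: (text, set of emotion labels).
Example = Tuple[str, Set[str]]


def balanced_subset(data: Sequence[Example], labels: Sequence[str],
                    rng: random.Random) -> List[Example]:
    """Undersample so that the same number of examples is drawn for each label."""
    by_label = {lab: [i for i, (_, labs) in enumerate(data) if lab in labs]
                for lab in labels}
    # Assumption: the rarest label fixes how many examples are drawn per label.
    target = min(len(indices) for indices in by_label.values())
    chosen = set()
    for lab in labels:
        chosen.update(rng.sample(by_label[lab], target))
    return [data[i] for i in sorted(chosen)]


def make_subsets(data: Sequence[Example], labels: Sequence[str],
                 n_subsets: int, seed: int = 0) -> List[List[Example]]:
    """Draw several balanced subsets, one per classifier in the ensemble."""
    rng = random.Random(seed)
    return [balanced_subset(data, labels, rng) for _ in range(n_subsets)]


def ensemble_predict(per_model_probs: Sequence[Dict[str, float]],
                     threshold: float = 0.5) -> Set[str]:
    """Average each label's probability over the classifiers, then threshold."""
    n = len(per_model_probs)
    labels = per_model_probs[0].keys()
    mean = {lab: sum(p[lab] for p in per_model_probs) / n for lab in labels}
    return {lab for lab, prob in mean.items() if prob >= threshold}
```

In this sketch, each subset returned by `make_subsets` would train one multi-label classifier, and `ensemble_predict` would merge the per-label probabilities those classifiers assign to a new text.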

Highlights

  • Sentiment Analysis is a common task in the field of Natural Language Processing (NLP)

  • Since the release of BERT, many researchers have applied BERT-based methods to sentiment analysis tasks, and the results show that BERT greatly improves model performance in this field

  • BERT(M) for multi-label sentiment analysis: we modify the standard BERT model for classification tasks to improve the performance of multi-label sentiment analysis in code-switching text
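The page does not say how the standard BERT classifier is modified. One common way to turn a single-label classifier into a multi-label one is to replace the softmax output with an independent sigmoid per label and train with binary cross-entropy; the sketch below shows only that output step, under that assumption. The `logits` are assumed to come from a linear layer on top of BERT's sentence representation, which is not shown, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def label_probabilities(logits: Sequence[float],
                        label_names: Sequence[str]) -> Dict[str, float]:
    """Score each label independently, so several emotions can be active at once."""
    return {name: sigmoid(z) for name, z in zip(label_names, logits)}


def predict_labels(logits: Sequence[float], label_names: Sequence[str],
                   threshold: float = 0.5) -> List[str]:
    """Keep every label whose probability reaches the threshold."""
    probs = label_probabilities(logits, label_names)
    return [name for name in label_names if probs[name] >= threshold]


def binary_cross_entropy(logits: Sequence[float],
                         targets: Sequence[int]) -> float:
    """Mean per-label binary cross-entropy; targets are 0 or 1 for each label."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1.0 - p + eps)
    return total / len(logits)
```

Because each label is scored on its own, a text can receive several emotions or none, which a softmax over mutually exclusive classes cannot express.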


Summary

Introduction

Sentiment analysis is a common task in the field of Natural Language Processing (NLP). The task aims to detect emotions in text. Previous studies have mainly focused on binary (positive and negative) and ternary (positive, negative, and neutral) sentiment analysis of monolingual text. On social media such as micro-blogs, emotions are often expressed in text that mixes multiple languages, which is called code-switching text. Detecting emotions in code-switching text is more complex than detecting them in monolingual text. Analyzing only two or three kinds of emotions is insufficient: people may feel sad, angry, afraid, or surprised, rather than having just two kinds of emotions.
