Abstract

Cross-domain sentiment classification is an important Natural Language Processing (NLP) task that aims to leverage knowledge obtained from a source domain to train a high-performance learner for sentiment classification on a target domain. Existing transfer learning methods for cross-domain sentiment classification mostly focus on inducing a low-dimensional feature representation shared across domains based on pivots and non-pivots, which is still a low-level representation of sequence data. Recently, the NLP literature has made great progress in developing high-level representation language models based on the Transformer architecture, which are pre-trained on large text corpora and fine-tuned for a specific task with an additional layer on top. Among such language models, the bidirectional contextualized Transformer language models BERT and XLNet have greatly impacted the NLP research field. In this paper, we fine-tune BERT and XLNet for cross-domain sentiment classification. We then explore their transferability in this setting through an in-depth analysis of the two models' performance and improve on the state of the art by a significant margin. Our results show that such bidirectional contextualized language models outperform the previous state-of-the-art methods for cross-domain sentiment classification while using up to 120 times less data.
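
To make the fine-tuning setup concrete, the sketch below loads a pre-trained BERT model with a classification layer on top and runs a single training step on two toy reviews, using the Hugging Face transformers library. It is a minimal illustration only; the model name, hyperparameters, and example sentences are assumptions rather than the paper's exact configuration.

    # Minimal sketch (not the authors' exact setup): fine-tuning a pre-trained
    # bidirectional Transformer for binary sentiment classification.
    import torch
    from transformers import BertTokenizer, BertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2  # classification layer added on top
    )

    # Hypothetical labelled source-domain reviews (e.g., Books).
    texts = ["A gripping, well-written novel.", "Dull plot and flat characters."]
    labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    outputs = model(**batch, labels=labels)  # cross-entropy loss on the [CLS] head
    outputs.loss.backward()
    optimizer.step()

The same pattern applies to XLNet by swapping in the corresponding tokenizer and sequence-classification classes.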

Highlights

  • With user sentiment and opinion expressions becoming widespread across social and e-commerce platforms, correctly understanding these thoughts and views is important for facilitating various downstream applications [1]

  • Deep neural networks have been successfully applied to diverse machine learning problems, including various Natural Language Processing (NLP) tasks, greatly improving prediction performance

  • Recurrent Neural Network (RNN)-based deep learning architectures have been the standard for various NLP tasks, including sentiment classification

Summary

INTRODUCTION

With user sentiment and opinion expressions becoming widespread across social and e-commerce platforms, correctly understanding these thoughts and views is important for facilitating various downstream applications [1]. Bidirectional Encoder Representations from Transformers (BERT) [15], a pre-training technique, has produced state-of-the-art models for a wide variety of NLP tasks, including question answering (SQuAD v1.1), natural language inference, text classification, and others. The latest such pre-trained language model is XLNet [16], a generalized autoregressive pretraining method that learns bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order, and that overcomes the limitations of BERT thanks to its autoregressive formulation.
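
As a rough illustration of how such a pre-trained model can be used in the cross-domain setting, the sketch below fine-tunes XLNet on a labelled source-domain review and then predicts the sentiment of an unlabelled target-domain review. The model name, domains, sentences, and hyperparameters are illustrative assumptions, not the paper's protocol.

    # Minimal sketch, not the paper's protocol: fine-tune on labelled
    # source-domain data, then score target-domain data.
    import torch
    from transformers import XLNetTokenizer, XLNetForSequenceClassification

    tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
    model = XLNetForSequenceClassification.from_pretrained(
        "xlnet-base-cased", num_labels=2
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    # Source domain (e.g., DVD reviews): one gradient step shown for brevity.
    src = tokenizer(["Loved every minute of this film."], return_tensors="pt")
    loss = model(**src, labels=torch.tensor([1])).loss
    loss.backward()
    optimizer.step()

    # Target domain (e.g., Kitchen reviews): prediction without target labels.
    model.eval()
    tgt = tokenizer(["The blender broke after two uses."], return_tensors="pt")
    with torch.no_grad():
        pred = model(**tgt).logits.argmax(dim=-1)  # 0 = negative, 1 = positive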

RELATED WORKS
FINE-TUNING FOR CDSC
Findings
CONCLUSION AND FUTURE WORK