Abstract
The huge cost of creating labeled training data is a common problem for supervised learning tasks such as sentiment classification. Recent studies have shown that pretraining with unlabeled data via a language model can improve the performance of classification models. In this paper, we take the concept a step further by using a conditional language model instead of a language model. Specifically, we address a sentiment classification task for a tweet analysis service as a case study and propose a pretraining strategy with unlabeled dialog data (tweet-reply pairs) via an encoder-decoder model. Experimental results show that our strategy can improve the performance of sentiment classifiers and outperform several state-of-the-art strategies, including language model pretraining.
Highlights
Sentiment classification is the task of predicting a sentiment label, such as positive/negative, for a given text, and has been applied to many domains such as movie/product reviews, customer surveys, news comments, and social media
Dai and Le (2015) recently proposed a semi-supervised sequence learning framework, where a sentiment classifier based on recurrent neural networks (RNNs) is trained with labeled data after initializing it with the parameters of an RNN-based language model pretrained with a large amount of unlabeled data
We report on a case study based on a costly labeled sentiment dataset of 99.5K items and a large-scale unlabeled dialog dataset of 22.3M tweet-reply pairs, which were provided by a tweet analysis service (Section 3.1)
Summary
Sentiment classification is the task of predicting a sentiment label, such as positive/negative, for a given text, and has been applied to many domains such as movie/product reviews, customer surveys, news comments, and social media. A common problem of this task is the lack of labeled training data due to costly annotation work, especially for social media without explicit sentiment feedback such as review scores. To overcome this problem, Dai and Le (2015) recently proposed a semi-supervised sequence learning framework, where a sentiment classifier based on recurrent neural networks (RNNs) is trained with labeled data after initializing it with the parameters of an RNN-based language model pretrained with a large amount of unlabeled data. The intuition behind using dialog data instead is that replies reflect the sentiment of the original message: people tend to write angry responses to angry messages, empathetic replies to sad remarks, and congratulatory phrases to good news
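The transfer step described above can be sketched in a few lines. The toy snippet below is a minimal illustration, not the paper's actual architecture: a small RNN encoder stands in for one pretrained inside an encoder-decoder model on unlabeled tweet-reply pairs, and its weights are then used to initialize a sentiment classifier before fine-tuning on the small labeled dataset. All class and function names here are hypothetical, and the pretraining and fine-tuning loops are omitted.

```python
import math
import random

random.seed(0)

def rand_mat(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

class SimpleRNNEncoder:
    """Toy RNN encoder; illustrative only, not the paper's model."""
    def __init__(self, vocab, hidden):
        self.emb = rand_mat(vocab, hidden)   # token embeddings
        self.w_h = rand_mat(hidden, hidden)  # recurrent weights
    def encode(self, token_ids):
        h = [0.0] * len(self.w_h)
        for t in token_ids:
            h = [math.tanh(e + r)
                 for e, r in zip(self.emb[t], matvec(self.w_h, h))]
        return h

class SentimentClassifier:
    def __init__(self, encoder, n_labels=2):
        self.encoder = encoder  # shares the pretrained encoder's weights
        self.w_out = rand_mat(n_labels, len(encoder.w_h))
    def predict(self, token_ids):
        logits = matvec(self.w_out, self.encoder.encode(token_ids))
        return logits.index(max(logits))

# Step 1 (omitted): pretrain the encoder inside an encoder-decoder model
# on unlabeled tweet-reply pairs; these weights stand in for that result.
pretrained = SimpleRNNEncoder(vocab=100, hidden=8)

# Step 2: initialize the classifier with the pretrained encoder, then
# fine-tune on the small labeled sentiment dataset (loop omitted).
clf = SentimentClassifier(pretrained)
print(clf.predict([3, 14, 15]))
```

The key point is that only the output layer starts from scratch; the encoder's embeddings and recurrent weights carry over from pretraining, which is what lets the small labeled dataset go further.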