Abstract

Domain adaptation has attracted considerable attention in recent years, particularly for cross-domain sentiment classification, and remarkable success has been achieved on specific domains with large amounts of labeled data. However, annotating enough data in each domain remains expensive and time-consuming, which hinders the practical application of domain adaptation. In this paper, we propose a Capsule network method with Identifying Transferable Knowledge (CITK), which uses transferable knowledge as common knowledge for cross-domain sentiment classification. The CITK model uses a capsule network to encode the intrinsic spatial part-whole relationships that constitute domain-invariant knowledge, bridging the knowledge gap between the source and target domains. In addition, we use the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to convert sentences to equal-length representations and obtain a more complete semantic embedding, so that Significant Consistent Polarity (SCP) words can be identified more accurately. Extensive experiments on a real-world data set covering four domains evaluate the effectiveness of the proposed CITK model. The experimental results show that CITK significantly outperforms state-of-the-art methods on the cross-domain sentiment classification task.
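The abstract mentions converting sentences to equal length before encoding. The following is only a minimal sketch of that idea, assuming whitespace-split tokens and a hypothetical `max_len`; it is not the authors' pipeline. A real BERT tokenizer additionally performs subword tokenization and adds special tokens such as `[CLS]` and `[SEP]`.

```python
def pad_or_truncate(tokens, max_len, pad_token="[PAD]"):
    """Return a token list of exactly max_len items plus an attention mask.

    The function name and max_len are illustrative assumptions,
    not values taken from the paper.
    """
    clipped = tokens[:max_len]                  # drop tokens beyond max_len
    n_pad = max_len - len(clipped)              # how many pad slots are needed
    padded = clipped + [pad_token] * n_pad      # right-pad short sentences
    mask = [1] * len(clipped) + [0] * n_pad     # 1 = real token, 0 = padding
    return padded, mask


# Hypothetical example sentences of different lengths.
short_padded, short_mask = pad_or_truncate("the battery lasts".split(), 5)
long_padded, long_mask = pad_or_truncate(
    "the waiter took far too long to come".split(), 5
)
```

Both outputs have the same length, so they can be stacked into one batch; the mask lets a downstream encoder ignore the padding positions.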

Highlights

  • Sentiment classification is an important task in natural language processing (NLP) and is essential to understand user opinions in social networks or product reviews

  • We propose a method that embeds Significant Consistent Polarity (SCP) words as common knowledge into the capsule network to address cross-domain sentiment classification

  • F-score for SCP words (Section V.B): this experiment verifies the accuracy of SCP word extraction using the F-score, which further supports the need to embed transferable knowledge
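The F-score mentioned in the last highlight can be sketched as follows. This is a generic set-based precision/recall/F computation, assuming a reference list of SCP words is available; the function name and the example words are hypothetical and do not come from the paper.

```python
def f_score(predicted, gold, beta=1.0):
    """Precision, recall, and F-beta of a predicted word set against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)            # words found in both sets
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0           # avoid division by zero
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# Hypothetical extracted and reference SCP words, for illustration only.
extracted = ["great", "terrible", "long", "excellent"]
reference = ["great", "terrible", "excellent", "awful"]
precision, recall, f1 = f_score(extracted, reference)
```

With `beta=1.0` this is the usual F1, the harmonic mean of precision and recall; a larger `beta` weights recall more heavily.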


Summary

Introduction

Sentiment classification is an important task in natural language processing (NLP) and is essential for understanding user opinions in social networks or product reviews. The task aims to predict the overall sentiment polarity (e.g., positive or negative) of a document. The polarity (positive or negative) of a word used to express an opinion often differs across domains. A phrase like "take too long" is positive in the electronics domain, but negative in the restaurant domain. As the number of domains grows, annotating data becomes a labor-intensive activity. For these reasons, we need a cross-domain sentiment analysis technology to accurately
