Abstract

A sentiment lexicon is an important tool for identifying the sentiment polarity of words and texts. How to construct sentiment lexicons automatically has become a research topic in sentiment analysis and opinion mining. Recently, there have been attempts to employ representation learning algorithms to construct a sentiment lexicon from sentiment-aware word embeddings. However, these methods are normally trained under document-level sentiment supervision. In this paper, we develop a neural architecture that trains a sentiment-aware word embedding by integrating sentiment supervision at both the document and word levels, improving the quality of the word embedding as well as the sentiment lexicon. Experiments on the SemEval 2013-2016 datasets indicate that the sentiment lexicon generated by our approach achieves state-of-the-art performance in both supervised and unsupervised sentiment classification, in comparison with several strong sentiment lexicon construction methods.

Highlights

  • A sentiment lexicon is a set of words (or phrases), each of which is assigned a sentiment polarity score

  • We propose constructing sentiment lexicons with a sentiment-aware word representation learning approach


Summary

Introduction

A sentiment lexicon is a set of words (or phrases), each of which is assigned a sentiment polarity score. Sentiment lexicons generated by earlier representation learning approaches predicted tweet sentiment labels better than the PMI-based method (Mohammad et al., 2013). Although these supervised learning methods can to some extent exploit the sentiment labels in the texts and learn a sentiment-aware word embedding, relying on document-level sentiment supervision handles complex linguistic phenomena such as negation, transition and comparative degree poorly, and cannot capture the fine-grained sentiment information in the text. For example, when the embeddings of the words in a negative document are summed to represent it, a word such as “like” appearing in that document will be falsely associated with the negative sentiment label. Such linguistic phenomena occur frequently in review texts and make sentiment-aware word representation learning less effective. Our approach obtains state-of-the-art performance in comparison with several strong sentiment lexicon construction methods on the benchmark SemEval 2013-2016 datasets for Twitter sentiment classification.
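
The problem with summed embeddings can be illustrated with a toy sketch. This is not the architecture of the paper: each word has a single scalar score instead of an embedding, the document is the sum of its word scores, and the mixing weight `alpha`, the toy document, and the word label for “like” are all assumptions made for illustration. With document-level supervision only, every word in a negative document receives the same gradient, so “like” drifts negative; adding a word-level term keeps it positive.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(docs, word_labels, alpha, epochs=200, lr=0.1):
    """Learn one scalar sentiment score per word (toy stand-in for an embedding).

    alpha weights the word-level loss and (1 - alpha) the document-level loss.
    This weighting scheme is an illustrative assumption.
    """
    vocab = {w for words, _ in docs for w in words}
    score = {w: 0.0 for w in vocab}
    for _ in range(epochs):
        for words, doc_label in docs:
            # Document-level term: the document is the sum of its word scores,
            # so every word gets the same gradient (p_doc - doc_label).
            p_doc = sigmoid(sum(score[w] for w in words))
            for w in words:
                grad = (1.0 - alpha) * (p_doc - doc_label)
                # Word-level term: applies only to words that have a
                # (hypothetical) word-level sentiment label.
                if w in word_labels:
                    grad += alpha * (sigmoid(score[w]) - word_labels[w])
                score[w] -= lr * grad
    return score


# One toy negative document (label 0) containing the positive word "like".
docs = [(["i", "do", "not", "like", "it"], 0)]

doc_only = train(docs, word_labels={}, alpha=0.0)
both = train(docs, word_labels={"like": 1}, alpha=0.5)

print("document-level only:", round(doc_only["like"], 3))
print("document + word level:", round(both["like"], 3))
```

In this sketch the document as a whole is still scored negative under joint supervision, because the unlabeled words (including “not”) absorb the negative signal while “like” keeps a positive score.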

Related Work
Our Approach
Learning Word-Level Sentiment Supervision
From Sentiment Representation to Sentiment Lexicon
Datasets and Settings
Word-level Sentiment Annotation
Tuning the Parameter α
Lexicon Analysis
Findings
Conclusion
