Abstract

In this paper, we propose a joint segmentation and classification framework for sentiment analysis. Existing sentiment classification algorithms typically split a sentence into a word sequence, which does not effectively handle the inconsistent sentiment polarity between a phrase and the words it contains, such as “not bad” and “a great deal of”. We address this issue by developing a joint segmentation and classification framework (JSC), which simultaneously conducts sentence segmentation and sentence-level sentiment classification. Specifically, we use a log-linear model to score each segmentation candidate, and exploit the phrasal information of top-ranked segmentations as features to build the sentiment classifier. A marginal log-likelihood objective function is devised for the segmentation model, which is optimized to enhance sentiment classification performance. The joint model is trained using only the annotated sentiment polarity of sentences, without any segmentation annotations. Experiments on a benchmark Twitter sentiment classification dataset from SemEval 2013 show that our joint model performs comparably with state-of-the-art methods.
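
As a rough illustration of the log-linear scoring described in the abstract, the sketch below turns feature-weighted scores for a set of segmentation candidates into a normalized distribution. The feature names, weights, and candidates are hypothetical and are not taken from the paper; the function name `log_linear_probs` is likewise only illustrative.

```python
import math

def log_linear_probs(candidates, weights):
    """Return p(candidate) proportional to exp(w . f(candidate))."""
    # Linear score for each candidate: dot product of weights and features.
    scores = [
        sum(weights.get(name, 0.0) * value for name, value in feats.items())
        for feats in candidates
    ]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical feature vectors for two segmentations of "not bad".
candidates = [
    {"num_phrases": 1.0, "has_negation_phrase": 1.0},  # [not bad]
    {"num_phrases": 2.0, "has_negation_phrase": 0.0},  # [not] [bad]
]
# Hypothetical weights; in the paper these would be learned from sentence labels.
weights = {"num_phrases": -0.5, "has_negation_phrase": 1.2}

probs = log_linear_probs(candidates, weights)
```

With these made-up weights, the candidate that keeps "not bad" together as one phrase receives the higher probability. In the framework described above, the weights would instead be learned from sentence-level polarity labels.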

Highlights

  • Sentiment classification, which classifies the sentiment polarity of a sentence as positive or negative, is a major research direction in the field of sentiment analysis (Pang and Lee, 2008; Liu, 2012; Feldman, 2013)

  • We show that the joint model yields performance comparable to the state-of-the-art methods on the benchmark Twitter sentiment classification dataset in SemEval 2013

  • The reason is that when a larger K is used, (1) at training time, the sentiment classifier is built using more phrasal information from multiple segmentations, which benefits from the ensemble effect; (2) at test time, the joint model considers several top-ranked segmentations and gets the final sentiment polarity through voting
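
The test-time voting over the top-K segmentations mentioned in the last highlight can be sketched as follows. This is a minimal illustration under stated assumptions: the lexicon `PHRASE_POLARITY`, the classifier `toy_classify`, and the ranked candidates are all hypothetical stand-ins, not the paper's model, and ties are not handled specially.

```python
from collections import Counter

def vote_polarity(ranked_segmentations, classify, k):
    """Classify each of the top-k segmentations and return the majority label."""
    top_k = ranked_segmentations[:k]
    votes = Counter(classify(seg) for seg in top_k)
    return votes.most_common(1)[0][0]

# Hypothetical phrase-level polarity lexicon (illustration only).
PHRASE_POLARITY = {"not bad": 1, "bad": -1, "great": 1}

def toy_classify(segmentation):
    """Toy stand-in for the sentiment classifier: sum phrase polarities."""
    score = sum(PHRASE_POLARITY.get(phrase, 0) for phrase in segmentation)
    return "positive" if score >= 0 else "negative"

# Segmentation candidates, assumed to be already sorted by the ranking model.
ranked = [
    ["the", "movie", "is", "not bad"],
    ["the movie", "is", "not bad"],
    ["the", "movie", "is", "not", "bad"],
]

label = vote_polarity(ranked, toy_classify, k=3)
```

Here the two segmentations that keep "not bad" as a phrase outvote the word-level segmentation, so the sentence is labeled positive.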

Summary

Introduction

Sentiment classification, which classifies the sentiment polarity of a sentence (or document) as positive or negative, is a major research direction in the field of sentiment analysis (Pang and Lee, 2008; Liu, 2012; Feldman, 2013). Sentiment classification is commonly treated as a special case of the text categorization task. Under this perspective, previous studies typically use pipelined methods with two steps. They first produce sentence segmentations with separate text analyzers (Choi and Cardie, 2008; Nakagawa et al., 2010; Socher et al., 2013b) or bag-of-words (Paltoglou and Thelwall, 2010; Maas et al., 2011), and then pass the segmentation results to the sentiment classification model. The major disadvantage of a pipelined method is error propagation, since sentence segmentation errors cannot be corrected by the sentiment classification model. In addition, segmentations based on bag-of-words or syntactic chunkers are not effective enough to handle polarity inconsistency, where a phrase such as “not bad” carries a different polarity from the words it contains.

Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 477–487, October 25-29, 2014, Doha, Qatar. © 2014 Association for Computational Linguistics

Related Work
The Proposed Approach
Task Definition
Segmentation Candidate Generation
Segmentation Ranking Model
Feature
Classification Model
Dataset and Experiment Settings
Baseline Methods
Results and Analysis
Comparing Joint and Pipelined Models
Effect of the beam size N
Effect of the top-ranked segmentation number K
Conclusion