Abstract

Extracting the semantic polarity of Chinese Multiword Expressions, especially newly coined Multiword Expressions from the internet (such as Weibo and other microblogs), is an important task for sentiment analysis of web text and other real-world text, because some Multiword Expressions express more integrated sentiment than individual words do. This paper proposes a method built around a novel latent discriminative algorithm, which addresses this problem by combining a discriminative model with a latent variable model. Although a Chinese Multiword Expression consists of multiple words, its semantic polarity is not simply the combination of the polarities of its component words: some words invert the affective polarity, so the Multiword Expression as a whole can carry the opposite polarity, as in ironic text. To capture this property, a hidden semi-CRF is adopted; it includes a latent variable layer and can therefore address dual sequence labeling tasks simultaneously. The method is evaluated on a manually labeled set of positive and negative Multiword Expressions collected from microblogs and other internet resources, and the experiments show very promising results, comparable to the best previously reported.
