Abstract

Extracting the semantic polarity of Chinese multiword expressions (MWEs), especially newly coined expressions from the internet (for example, Weibo and other microblogs), is an important task for sentiment analysis of web text and other real-world text, because some MWEs express more integrated sentiment than individual words do. This paper proposes a method built on a novel latent discriminative algorithm, which addresses the problem by integrating a discriminative model with a latent variable model. Although a Chinese MWE consists of multiple words, its semantic polarity is not simply the combination of the polarities of its component words: some words invert the affective polarity, so the expression as a whole can carry the opposite polarity, as in ironic text. To capture this property, a hidden semi-CRF is adopted; it includes a latent variable layer and can address dual-sequence labeling tasks simultaneously. The method is evaluated on a manually labeled set of positive and negative MWEs drawn from microblogs and other internet sources, and the experiments show very promising results, comparable to the best previously reported.
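The claim that polarity is not simply the sum of component-word polarities can be illustrated with a toy sketch. This is not the hidden semi-CRF described above; it is a minimal hand-written rule, and the lexicon, the shifter list, the English stand-in words, and both function names are assumptions introduced only for illustration.

```python
# Toy lexicon: hypothetical polarity scores, not taken from the paper.
WORD_POLARITY = {"good": 1, "happy": 1, "bad": -1, "terrible": -1}

# Hypothetical polarity shifters (English stand-ins for illustration).
SHIFTERS = {"not", "hardly"}


def additive_polarity(words):
    """Naive baseline: sum the polarities of the component words."""
    return sum(WORD_POLARITY.get(w, 0) for w in words)


def shifter_aware_polarity(words):
    """Each shifter toggles the sign applied to the words after it."""
    sign, total = 1, 0
    for w in words:
        if w in SHIFTERS:
            sign = -sign
        else:
            total += sign * WORD_POLARITY.get(w, 0)
    return total


expression = ["not", "good"]
print(additive_polarity(expression))
print(shifter_aware_polarity(expression))
```

For the expression `["not", "good"]`, the additive baseline gives a positive score while the shifter-aware rule gives a negative one. A fixed rule like this cannot handle irony or unseen expressions, which is the kind of case a learned latent layer is meant to address.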
