Sentiment classification model for Chinese micro-blog comments based on key sentences extraction

Shunxiang Zhang, Guangli Zhu, Kuan-Ching Li, Zhaoya Hu, Ming Jin

doi:10.1007/s00500-020-05160-8

Abstract

With advances in communication technology, the number of blogs has grown remarkably. Information sharing has attracted considerable research interest, as people contribute to the information shared through the Internet. Such sharing is common in the Chinese language, as approximately 15% of the world’s population are native speakers of Mandarin. Comments in such micro-blogs may contain complex sentences, which can reduce the accuracy of sentiment classification. To address this problem, a Sentiment Classification model of Chinese Micro-blog Comments based on Key Sentences (SC-CMC-KS) is proposed. First, a key sentence extraction algorithm for Chinese micro-blog comments is presented, which recognizes the key sentences of a given comment by considering three factors: the sentiment attributes, the location attributes, and the critical feature word attributes. Second, an algorithm for computing sentence sentiment values is designed that integrates dependency relationships and multi-rules (i.e., the sentence-type rule and the inter-sentence rule). A modification distance is defined to describe the relationship between modification words and core words; the phrase-level sentiment value is computed according to the calculation rules, and the sentence-level sentiment value is then obtained based on the multi-rules. Finally, a sentiment classification algorithm for micro-blog comments is presented, in which the key sentences and emoticons of the complete micro-blog are weighted to compute the final sentiment value, and comments are classified according to a set threshold. Experimental results show that the model is effective and promising.
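The final classification step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Sentence` fields, the `classify_comment` function, the weights, the `top_k` cut-off, and the threshold value are all hypothetical, and sentence-level sentiment values and key-sentence scores are assumed to have been computed already.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    text: str
    sentiment: float  # assumed sentence-level sentiment value in [-1, 1]
    key_score: float  # hypothetical score combining sentiment, location, and feature-word attributes


def classify_comment(sentences, emoticon_value,
                     key_weight=0.7, emoticon_weight=0.3,
                     top_k=1, threshold=0.0):
    """Weight key sentences and emoticons, then classify by threshold.

    All parameter values are illustrative assumptions, not values from the paper.
    """
    # Keep the top_k sentences with the highest key-sentence score.
    key_sentences = sorted(sentences, key=lambda s: s.key_score, reverse=True)[:top_k]

    # Average the sentiment of the selected key sentences (0.0 if there are none).
    if key_sentences:
        key_value = sum(s.sentiment for s in key_sentences) / len(key_sentences)
    else:
        key_value = 0.0

    # Weighted combination of key-sentence sentiment and emoticon sentiment.
    final_value = key_weight * key_value + emoticon_weight * emoticon_value

    # Threshold-based classification.
    if final_value > threshold:
        label = "positive"
    elif final_value < -threshold:
        label = "negative"
    else:
        label = "neutral"
    return label, final_value


comment = [
    Sentence("first sentence", sentiment=0.8, key_score=0.9),
    Sentence("second sentence", sentiment=-0.2, key_score=0.1),
]
label, value = classify_comment(comment, emoticon_value=0.5)
```

In this sketch only the highest-scoring sentence contributes to the comment's sentiment, so a complex but peripheral sentence does not affect the result; how the model actually scores key sentences and sets the weights is defined in the full text.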

Full Text