Microblog sentiment analysis aims to discover users' attitudes toward hot events. Its main difficulties are the short length of the texts and the lack of labeled corpora. Para2vec, a deep-learning-based method, has recently attracted attention, and the low-dimensional paragraph vectors it trains achieve excellent results on sentiment analysis. However, when we apply it to sentiment analysis on microblogs, we find that it does not work as well as on ordinary texts. In this paper, we analyse the weaknesses of microblog sentiment analysis based on paragraph vectors. We then propose two categories of methods, model extension and emotional tendency vectors, to improve the para2vec model. The experimental results confirm the soundness of our methods. Data analysis shows that our improved methods effectively reduce the adverse effects of short text and greatly improve the accuracy of sentiment analysis.