Abstract

Short text sentiment analysis is challenging because short texts are limited in length and lack context. Short texts are usually rather ambiguous because of polysemy and the typos these texts contain. Polysemy is the coexistence of multiple word meanings and commonly appears in every language. Various uses of a word may assign the word both positive and negative meanings. In previous studies, the variability of words is often ignored, which may cause analysis errors. In this study, to resolve this problem, we proposed a novel text representation model named Word2PLTS for short text sentiment analysis by introducing probabilistic linguistic terms sets (PLTSs) and the relevant theory. In this model, every word is represented as a PLTS that fully describes the possibilities for the sentiment polarity of the word. Then, by using support vector machines (SVM), a novel sentiment analysis and polarity classification framework named SACPC is obtained. This framework is a technique that combines supervised learning and unsupervised learning. We compare SAPCP to lexicon-based approaches and machine learning approaches on three benchmark datasets. A noticeable improvement in performance is achieved. To further verify the superiority of SAPCP, state of the art performance comparisons are conducted, and the results are impressive.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call