Abstract

Sentiment analysis has become popular in business intelligence and analytics applications because of the growing need to extract insights from the vast amount of user-generated content on the Internet. One major challenge of sentiment analysis, as with most text classification tasks, is finding structure in unstructured text. Existing sentiment analysis techniques follow either a supervised learning approach or a lexicon scoring approach, both of which largely rely on representing a document as a collection of words and phrases. The semantic ambiguity (i.e., polysemy) of single words and the sparsity of phrases undermine the robustness of sentiment analysis, especially for short social media texts. In this study, we propose to represent texts using dependency features and test their effectiveness in supervised sentiment classification. We compare our method with current standard practice on a labeled data set of 170,874 microblogging messages. The combination of unigram features and dependency features significantly outperformed other popular types of features.
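
To make the idea of combining unigram and dependency features concrete, the following is a minimal sketch, not the paper's implementation: it assumes spaCy (with the en_core_web_sm model) and scikit-learn, extracts (head, relation, child) dependency triples alongside unigrams, and feeds both into a standard supervised classifier. The corpus, classifier choice, and feature encoding here are illustrative placeholders.

```python
# Minimal sketch (not the authors' pipeline): unigram + dependency-triple
# features for supervised sentiment classification.
import spacy
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

nlp = spacy.load("en_core_web_sm")

def unigram_and_dependency_tokens(text):
    """Return unigrams plus head_relation_child dependency features."""
    doc = nlp(text)
    unigrams = [tok.lemma_.lower() for tok in doc if not tok.is_space]
    dep_triples = [
        f"{tok.head.lemma_.lower()}_{tok.dep_}_{tok.lemma_.lower()}"
        for tok in doc
        if tok.dep_ != "ROOT"
    ]
    return unigrams + dep_triples

# CountVectorizer applies the custom analyzer to each raw document,
# so unigram and dependency features share one feature space.
model = make_pipeline(
    CountVectorizer(analyzer=unigram_and_dependency_tokens, binary=True),
    LogisticRegression(max_iter=1000),
)

# Hypothetical toy data; the paper uses a labeled microblog corpus.
texts = ["I love this phone", "The service was not good at all"]
labels = [1, 0]
model.fit(texts, labels)
print(model.predict(["not a great experience"]))
```

Dependency triples such as "good_neg_not" keep the negation attached to the word it modifies, which is the kind of structure that bag-of-words unigrams alone cannot capture.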
