Abstract

Research on Twitter sentiment analysis (TSA), a technique for extracting opinion by automatically processing digital data, has made remarkable progress. In this article, we propose a feature-based TSA system with improved negation accounting. The system leverages several types of features, including lexicon-based, morphological, POS-based, and n-gram features, among others, which are used for classifier training and have a strong impact on polarity determination. We use three state-of-the-art classifiers, namely support vector machine (SVM), Naive Bayes, and decision tree, and conduct a series of experiments to determine which classifier works well with which feature group. In addition, this work investigates negation, a significant linguistic phenomenon that can change either the polarity or the strength of polarity of opinionated words. To enhance classification performance, we also develop an algorithm to handle tweets in which the presence of a negation word does not necessarily signal negation. The proposed feature-based Twitter system with negation accounting is evaluated on the benchmark Twitter data set of SemEval-2013 Task 2. The experimental results demonstrate that the SVM classifier outperforms the other two classifiers as well as the state-of-the-art TSA system developed by NRC Canada, the winning team of SemEval-2013 Task 2. Extensive experiments further demonstrate that the proposed negation strategy, which incorporates negation exception rules, provides a substantial improvement by preventing misclassification of tweets. Finally, the impact of each preprocessing module on classification performance is presented.
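
To make the idea of negation accounting with exception rules concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a common convention in which tokens following a negation cue are tagged with a `_NEG` suffix until the next clause-level punctuation mark, and it skips that tagging when the cue opens a phrase in which the negation word does not actually negate. The cue list, the exception phrases, and the function name `mark_negation` are hypothetical choices made for illustration; the abstract does not enumerate the actual rules.

```python
import re

# Hypothetical negation cues (illustrative, not the paper's list).
NEGATION_CUES = {"not", "no", "never", "cannot"}

# Hypothetical exception phrases: a negation word is present,
# but the phrase does not negate what follows.
EXCEPTION_PHRASES = {"not only", "not just", "no wonder"}

CLAUSE_END = set(".,:;!?")


def mark_negation(text):
    """Return lowercase tokens, with tokens inside a negation scope
    suffixed by _NEG. A scope opens after a cue and closes at the
    next clause-level punctuation mark."""
    tokens = re.findall(r"[\w']+|[.,:;!?]", text.lower())
    marked = []
    in_scope = False
    for i, tok in enumerate(tokens):
        if tok in CLAUSE_END:
            in_scope = False          # punctuation closes the scope
            marked.append(tok)
            continue
        is_cue = tok in NEGATION_CUES or tok.endswith("n't")
        if is_cue:
            bigram = " ".join(tokens[i:i + 2])
            if bigram not in EXCEPTION_PHRASES:
                in_scope = True       # ordinary negation opens a scope
            marked.append(tok)
            continue
        marked.append(tok + "_NEG" if in_scope else tok)
    return marked


if __name__ == "__main__":
    print(mark_negation("I do not like this movie, but the cast is great"))
    print(mark_negation("Not only is it fast, it is cheap"))
```

In a feature-based pipeline, the tagged tokens could then feed n-gram or lexicon features, so that `like` and `like_NEG` are treated as distinct evidence by the classifier, while exception phrases such as "not only" leave the following words untagged.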
