Abstract

Sentiment analysis research has grown rapidly in recent years because of its wide range of business and social applications. The motivation behind sentiment analysis is that it gives companies a way to gauge product acceptance and to find ways to improve product quality. It also helps users make purchasing decisions. Various parsing schemes and feature extraction methods have been proposed in the literature to process unstructured text and extract patterns that may help a machine learning model learn. The main limitations of existing feature extraction techniques are the sparseness of the data and the inability to incorporate semantic information. In this paper, a new feature extraction method is proposed, namely clustering features. The proposed technique alleviates the data sparsity faced by supervised sentiment analysis by clustering semantic features. The resulting clustering features can incorporate semantic information and reduce data sparseness for the machine learning algorithm. In all the experiments, support vector machine (SVM) and Boolean Multinomial Naive Bayes (BMNB) machine learning algorithms are used for classification. Experimental results show that the proposed clustering features significantly outperform other features for document-level sentiment classification. All the experiments are performed on a standard movie review data-set and on product review data-sets, namely books, electronics, and kitchen appliances.
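The abstract does not say how the clustering is carried out, so the following is only a minimal sketch of the general idea, under stated assumptions: words are represented by made-up 2-D "semantic" vectors, grouped with plain k-means, and each document is then encoded as a Boolean vector over clusters rather than over individual words. The vectors, the initial centroids, and the helper names (`kmeans`, `word_to_cluster`, `cluster_features`) are all hypothetical and are not taken from the paper.

```python
from math import dist

# Toy 2-D "semantic" vectors, invented for illustration only.
WORD_VECTORS = {
    "great": (0.9, 0.8),
    "excellent": (1.0, 0.9),
    "superb": (0.95, 0.85),
    "awful": (-0.9, -0.8),
    "terrible": (-1.0, -0.9),
    "dreadful": (-0.95, -0.85),
}


def nearest_index(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Plain k-means from fixed initial centroids; returns the final centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest_index(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def word_to_cluster(vectors, centroids):
    """Map every known word to the index of its nearest centroid."""
    return {w: nearest_index(v, centroids) for w, v in vectors.items()}


def cluster_features(tokens, mapping, k):
    """Boolean vector: entry i is 1 if any token belongs to cluster i."""
    features = [0] * k
    for token in tokens:
        if token in mapping:
            features[mapping[token]] = 1
    return features


# Assumed initial centroids; a real system would choose k and the seeds differently.
centroids = kmeans(list(WORD_VECTORS.values()), [(0.5, 0.5), (-0.5, -0.5)])
mapping = word_to_cluster(WORD_VECTORS, centroids)

features_a = cluster_features("the plot was superb".split(), mapping, k=2)
features_b = cluster_features("an excellent film".split(), mapping, k=2)
print(features_a, features_b)
```

In this toy setup the two reviews share no sentiment-bearing words, so a word-level Boolean representation would give them disjoint features, while the cluster-level representation maps both to the same cluster. That is the sense in which clustering can reduce sparsity. The resulting vectors could then be passed to a classifier such as the SVM or BMNB mentioned in the abstract.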
