Abstract

As big data analytics gains popularity for online product reviews, the sheer volume of data has become the central challenge. Sentiment analysis and opinion mining are useful for text- and web-based problems, and this work performs sentiment analysis on the Hadoop framework, a reliable and fault-tolerant model for processing large amounts of data. Sentiment analysis plays a critical role in text mining applications such as consumer attitude recognition, brand and product detection, customer relationship management, and market research. Subjectivity classification labels data as either subjective or objective; sentiment classification then divides the subjective data into positive, negative, or neutral. The sentiment is classified using features extracted from the data. Feature selection has gained prominence because it reduces the time and computational cost of classification. This work uses Term Frequency (TF) for feature extraction and applies feature selection based on Information Gain (IG) and Particle Swarm Optimization (PSO) to sentiment classification. These schemes shrink the original feature set by eliminating redundant features, which improves classification accuracy and decreases the running time of the learning algorithms. A K-nearest neighbour (KNN) classifier is used to evaluate the proposed scheme. Empirical results show that PSO-based feature selection attains better and more robust performance than IG-based feature selection.
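
The TF, IG, and KNN stages described above can be illustrated with a minimal single-machine sketch. It is not the authors' Hadoop implementation, and it leaves out the PSO-based selection entirely. The toy reviews, the labels, the whitespace tokenizer, and every function name are assumptions made for illustration only. IG is computed here for the binary feature "term occurs in the document", which is one common formulation and may differ from the one used in the paper.

```python
import math
from collections import Counter

def term_frequencies(doc):
    """Raw term counts (TF) for a whitespace-tokenized, lowercased document."""
    return Counter(doc.lower().split())

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(term, tf_docs, labels):
    """IG of the binary feature 'term occurs in document' w.r.t. the labels."""
    present = [y for tf, y in zip(tf_docs, labels) if tf[term] > 0]
    absent = [y for tf, y in zip(tf_docs, labels) if tf[term] == 0]
    n = len(labels)
    conditional = ((len(present) / n) * entropy(present)
                   + (len(absent) / n) * entropy(absent))
    return entropy(labels) - conditional

def select_top_k(tf_docs, labels, k):
    """Keep the k terms with the highest IG (ties broken alphabetically)."""
    vocab = sorted({t for tf in tf_docs for t in tf})
    ranked = sorted(vocab,
                    key=lambda t: (-information_gain(t, tf_docs, labels), t))
    return ranked[:k]

def knn_predict(query_tf, tf_docs, labels, features, k=3):
    """Majority vote among the k nearest documents (Euclidean distance on TF)."""
    def dist(a, b):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))
    nearest = sorted(range(len(tf_docs)),
                     key=lambda i: dist(query_tf, tf_docs[i]))[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical toy corpus, not data from the paper.
reviews = [
    "great phone great battery",
    "good camera great screen",
    "bad battery poor screen",
    "poor camera bad phone",
]
labels = ["pos", "pos", "neg", "neg"]

tf_docs = [term_frequencies(r) for r in reviews]
selected = select_top_k(tf_docs, labels, k=3)
prediction = knn_predict(term_frequencies("great screen good battery"),
                         tf_docs, labels, selected, k=3)
print(selected, prediction)
```

Terms such as "battery" or "screen" occur equally often in both classes of this toy corpus, so their IG is zero and they are dropped. This is the kind of redundancy the feature selection step is meant to remove before classification.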
