Abstract

With the rapid expansion of e-commerce over recent decades, the number of product reviews on e-commerce sites has grown steadily. To make effective use of the information these reviews contain, an automatic opinion mining system is needed to organize them and to help users and organizations make informed decisions about products. Opinion mining systems based on machine learning classify reviews containing customer opinions as positive or negative. In this paper we explore the new research area of applying a hybrid combination of machine learning approaches coupled with principal component analysis as a feature reduction technique. We introduce two hybrid ensemble-based models (bagging and Bayesian boosting) for opinion classification. The results are compared with two individual classifiers based on statistical learning (logistic regression and support vector machine) on a dataset of product reviews. A second objective is to compare the influence of different n-gram schemes (unigrams, bigrams, and trigrams). We found that the ensemble-based hybrid methods perform better on various quality measures when classifying opinions as positive or negative. We also applied a pairwise statistical test to assess whether the differences between the classifiers are significant.
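To make the n-gram schemes concrete, the following is a minimal sketch of how unigram, bigram, and trigram features might be extracted from a single review. It is not the paper's implementation: the whitespace tokenizer, the lowercasing step, the sample sentence, and the function names `tokenize` and `ngrams` are all assumptions made for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return every contiguous run of n tokens, joined by single spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical review text, used only to illustrate the three schemes.
review = "The battery life is not good"
tokens = tokenize(review)

for n, name in [(1, "unigrams"), (2, "bigrams"), (3, "trigrams")]:
    # Counter turns the n-gram list into a sparse count vector for this review.
    counts = Counter(ngrams(tokens, n))
    print(name, dict(counts))
```

In this sample, the bigram "not good" keeps the negation attached to the sentiment word, which the unigrams "not" and "good" cannot do on their own. This is one reason the choice of n-gram scheme can affect classification.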
