Abstract

With the rapid growth of online shopping, it has become crucial for manufacturers to analyze and handle a large volume of product reviews. However, such a high volume of reviews, along with a wide variety of opinions, makes it hard for manufacturers to know exactly how to improve their products without an efficient approach. To this end, the results of sentiment classification would help customers retrieve the information they need to choose an appropriate product, and help sellers collect customer feedback effectively in order to improve their products. Like most real‐world problems, the shopping review data used in this study were imbalanced, being predominantly composed of positive reviews with only a small percentage of negative reviews. Machine learning (ML) algorithms do not perform well when data are imbalanced, as they tend to become biased toward the overrepresented class. The synthetic minority over‐sampling technique (SMOTE) was used to address this class imbalance problem. In this study, three different ML‐based algorithms, namely Naïve Bayes (NB), support vector machine (SVM), and decision tree (DT), were employed. An extensive preprocessing procedure was applied to prepare the text datasets, and its details are discussed in the manuscript. The performance analysis indicated that the DT algorithm outperformed the other two methods. As positive reviews account for the majority of the reviews, removing sparse words from the data resulted in the removal of almost all of the sentiment terms from negative reviews. Hence, the model training process was performed on positive and negative reviews separately. Combining the review titles with their contents, a separate tokenization process, the application of various N‐grams, and retaining stop words that carry sentiment (e.g. “not” or “but”) were some of the other steps considered to improve the performance of the model.
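As a rough illustration of two of the steps named above, the following Python sketch shows (1) tokenization that joins a review title with its content, retains sentiment-bearing stop words, and emits N‐grams, and (2) the interpolation idea behind SMOTE. It is a minimal sketch under stated assumptions, not the procedure used in the study: the stop-word list, the function names `tokenize` and `smote`, and all parameter values are hypothetical, and the SMOTE routine is a simplified pure-Python version rather than a library implementation.

```python
import random
import re

# Assumed, illustrative stop-word list; the study's actual list is not given.
STOP_WORDS = {"the", "a", "an", "is", "it", "this", "and", "but", "not", "of", "to"}
# Stop words retained anyway because they can flip or qualify sentiment.
KEEP = {"not", "but"}


def tokenize(title, content, max_n=2):
    """Join title and content, drop stop words except KEEP, emit 1..max_n-grams."""
    text = f"{title} {content}".lower()
    words = [w for w in re.findall(r"[a-z']+", text)
             if w not in STOP_WORDS or w in KEEP]
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            grams.append(" ".join(words[i:i + n]))
    return grams


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each interpolated between a randomly
    chosen minority point and one of its k nearest minority neighbours.
    Requires at least two minority points."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical review and toy feature vectors, for illustration only.
grams = tokenize("Not good", "It is not worth the price")
new_points = smote([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]], n_new=5)
```

Because "not" is retained, the bigrams "not good" and "not worth" survive tokenization, which is the kind of negation signal that would be lost if all stop words were removed. The synthetic points always lie on a line segment between two existing minority points, so they stay inside the region the minority class already occupies.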
