Classification of Shopify App User Reviews Using Novel Multi Text Features

Furqan Rustam, Muhammad Ahmad, Dost Muhammad Khan, Gyu Sang Choi, Saleem Ullah, Arif Mehmood

doi:10.1109/access.2020.2972632

Abstract

App stores usually allow users to give reviews and ratings that developers use to resolve issues and plan changes to their apps. In this way, app stores collect large amounts of data for analysis. However, several challenges related to redundancy and the volume of data must first be addressed using machine learning. This study performs experiments on a dataset that contains reviews of Shopify apps. To overcome the aforementioned limitations, we first categorize user reviews into two groups, i.e., happy and unhappy, and then preprocess the reviews to clean the data. At a later stage, several feature engineering techniques, such as bag-of-words, term frequency-inverse document frequency (TF-IDF), and chi-square (Chi2), are used singly and in combination to preserve meaningful information. Finally, the random forest, AdaBoost classifier, and logistic regression models are used to classify the reviews as happy or unhappy. The performance of our proposed pipeline was evaluated using average accuracy, precision, recall, and F1 score. The experiments reveal that a combination of features can improve the performance of machine learning models. In this study, logistic regression outperforms the other models and achieves an 83% true acceptance rate when combined with TF-IDF and Chi2.
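The feature engineering steps named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the toy reviews, the TF-IDF variant (term count divided by review length, times ln(N/df)), the 2x2 presence/absence form of the chi-square statistic, and the idea of using Chi2 to pick which TF-IDF columns to keep are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy reviews (not from the paper's dataset): 1 = happy, 0 = unhappy.
REVIEWS = [
    ("great app easy setup", 1),
    ("love this app great support", 1),
    ("app keeps crashing bad support", 0),
    ("bad update app broken", 0),
]

docs = [text.split() for text, _ in REVIEWS]
labels = [label for _, label in REVIEWS]
vocab = sorted({token for doc in docs for token in doc})


def tfidf(doc, corpus):
    """TF-IDF weights for one tokenized review.

    Assumed variant: tf = count / len(doc), idf = ln(N / df).
    """
    counts = Counter(doc)  # bag-of-words counts for this review
    weights = {}
    for term, count in counts.items():
        df = sum(1 for other in corpus if term in other)
        weights[term] = (count / len(doc)) * math.log(len(corpus) / df)
    return weights


def chi_square(term, corpus, y):
    """Chi-square statistic of the 2x2 table: term present/absent vs. happy/unhappy."""
    n11 = sum(1 for d, label in zip(corpus, y) if term in d and label == 1)
    n10 = sum(1 for d, label in zip(corpus, y) if term in d and label == 0)
    n01 = sum(1 for d, label in zip(corpus, y) if term not in d and label == 1)
    n00 = sum(1 for d, label in zip(corpus, y) if term not in d and label == 0)
    denom = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    if denom == 0:  # term occurs in every review or in none: no signal
        return 0.0
    return len(corpus) * (n11 * n00 - n10 * n01) ** 2 / denom


def select_terms(corpus, y, terms, k):
    """Keep the k terms with the highest chi-square score (ties broken alphabetically)."""
    scores = {t: chi_square(t, corpus, y) for t in terms}
    return sorted(terms, key=lambda t: (-scores[t], t))[:k]


# Combine the two techniques: TF-IDF values restricted to chi-square-selected terms.
selected = select_terms(docs, labels, vocab, k=2)
features = [[tfidf(doc, docs).get(t, 0.0) for t in selected] for doc in docs]
```

On this toy corpus the word "app" appears in every review, so both its TF-IDF weight and its chi-square score are zero, while terms that occur in only one class score highest. The resulting `features` matrix is what a classifier such as logistic regression would be trained on.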

Highlights

Manufacturers always want to know the success rate of their products/apps, and for that, they usually request users to provide feedback that is later used to analyze the impact and quality of their products [1], [2].
Logistic regression (LR) performs significantly better in classification; researchers usually prefer LR for binary classification problems.
We compare the results of two tree-based ensemble algorithms, random forest (RF) and AdaBoost classifier (AC), with a statistical algorithm, LR.
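Comparing classifiers such as RF, AC, and LR comes down to scoring each model's predictions with accuracy, precision, recall, and F1. The sketch below computes these for a binary happy/unhappy task from the confusion counts. The labels and predictions are hypothetical, "happy" is assumed to be the positive class, and the paper's exact averaging scheme is not reproduced here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for one binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions (1 = happy, 0 = unhappy).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

scores = binary_metrics(y_true, y_pred)
```

Running the same function on each model's predictions gives directly comparable numbers; here one happy review is missed and one unhappy review is mislabeled, so all four metrics come out to 0.75.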

Summary

Introduction

Manufacturers always want to know the success rate of their products/apps, and for that, they usually request users to provide feedback that is later used to analyze the impact and quality of their products [1], [2]. The work [4] built a mobile app review analyzer that automatically extracts user requests or suggestions from reviews. The work [5] presented some probabilistic techniques for classifying app reviews. They classified these reviews into four categories: ratings, bug reports, feature requests, and user experiences. They used multiple binary classifiers to classify reviews and achieved acceptable results. The work [6] used different machine learning algorithms to solve app review classification problems. They performed a comparative analysis of the results of



Journal: IEEE Access	Publication Date: Jan 1, 2020
Citations: 90	License type: CC BY 4.0


