Abstract

App stores usually allow users to give reviews and ratings, which developers use to resolve issues and plan future work on their apps. In this way, app stores collect large amounts of data for analysis. However, several challenges related to the redundancy and volume of this data must first be addressed, which can be done using machine learning. This study performs experiments on a dataset that contains reviews for Shopify apps. To overcome the aforementioned limitations, we first categorize user reviews into two groups, i.e., happy and unhappy, and then preprocess the reviews to clean the data. At a later stage, several feature engineering techniques, such as bag-of-words, term frequency-inverse document frequency (TF-IDF), and chi-square (Chi2), are used singly and in combination to preserve meaningful information. Finally, the random forest, AdaBoost classifier, and logistic regression models are used to classify the reviews as happy or unhappy. The performance of our proposed pipeline was evaluated using average accuracy, precision, recall, and F1 score. The experiments reveal that a combination of features can improve the performance of machine learning models. In this study, logistic regression outperforms the other models and achieves an 83% true acceptance rate when combined with TF-IDF and Chi2.
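As a rough illustration of how TF-IDF weighting and Chi2 feature selection can be combined, the following minimal sketch uses only the Python standard library. Everything in it is an assumption made for illustration: the toy reviews are hypothetical and not taken from the Shopify dataset, the function names (`tokenize`, `tfidf`, `chi2_scores`, `select_features`) are invented here, TF-IDF uses the plain tf × log(N/df) form, and Chi2 uses the textbook 2×2 contingency table of term presence against class. The study's actual preprocessing, weighting variant, and implementation may differ.

```python
import math
from collections import Counter

# Hypothetical toy reviews, NOT taken from the paper's Shopify dataset.
REVIEWS = [
    ("great app easy setup", "happy"),
    ("great support love it", "happy"),
    ("app crashes terrible support", "unhappy"),
    ("terrible app waste of money", "unhappy"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    rows = []
    for doc in docs:
        tf = Counter(doc)
        rows.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return rows


def chi2_scores(docs, labels, positive="happy"):
    """Chi-square of term presence vs. class, from a 2x2 contingency table."""
    n = len(docs)
    vocab = sorted({t for doc in docs for t in doc})
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            present = term in doc
            is_pos = label == positive
            if present and is_pos:
                a += 1  # term present, positive class
            elif present:
                b += 1  # term present, negative class
            elif is_pos:
                c += 1  # term absent, positive class
            else:
                d += 1  # term absent, negative class
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_features(docs, labels, k):
    """Keep the TF-IDF weights of the k terms with the highest Chi2 score."""
    scores = chi2_scores(docs, labels)
    top = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    keep = set(top)
    rows = [{t: w for t, w in row.items() if t in keep} for row in tfidf(docs)]
    return rows, top


docs = [tokenize(text) for text, _ in REVIEWS]
labels = [label for _, label in REVIEWS]
rows, top_terms = select_features(docs, labels, k=2)
print(top_terms)
print(rows)
```

On this toy input, the terms that occur in only one class receive the highest Chi2 scores, while a term that appears equally often in both classes scores zero and is dropped. The reduced TF-IDF rows would then be passed to a classifier such as logistic regression.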

Highlights

  • Manufacturers always want to know the success rate of their products/apps, and for that, they usually request users to provide feedback that is later used to analyze the impact and quality of their products [1], [2]

  • Logistic regression (LR) performs significantly better in classification tasks, and researchers usually prefer it for binary classification problems

  • We compare the results of two tree-based ensemble algorithms, random forest (RF) and AdaBoost classifier (AC), with a statistical algorithm, LR


Summary

Introduction

Manufacturers always want to know the success rate of their products/apps, and for that, they usually request users to provide feedback that is later used to analyze the impact and quality of their products [1], [2]. The work in [4] built a mobile app review analyzer that automatically extracts user requests or suggestions from reviews. The work in [5] presented some probabilistic techniques for classifying app reviews. The authors classified these reviews into four categories: ratings, bug reports, feature requests, and user experiences. They used multiple binary classifiers to classify reviews and achieved acceptable results. The work in [6] used different machine learning algorithms to solve app review classification problems. They performed a comparative analysis of the results of
