Methodology for Analyzing the Traditional Algorithms Performance of User Reviews Using Machine Learning Techniques

Abdul Karim, Samir Brahim Belhaouri, Maqsood Ahmad, Azhari Azhari, Ali Adil Qureshi

doi:10.3390/a13080202

Abstract

Android-based applications are used by people around the globe. Because the Internet is available almost everywhere at no charge, almost half of the world's population is engaged in social networking, social media surfing, messaging, browsing and plugins. The Google Play Store, one of the most popular Internet application stores, encourages users to download thousands of applications and various types of software. In this research study, we scraped user reviews and ratings for 148 applications from 14 different categories. A total of 506,259 reviews were accumulated and assessed. Based on the semantics of the reviews, each review was classified as negative, positive or neutral. Different machine-learning algorithms, namely logistic regression, random forest and naïve Bayes, were tuned and tested. We also evaluated the effect of term frequency (TF) and inverse document frequency (IDF), measured accuracy, precision, recall and F1 score (F1), and present the results in the form of a bar graph. In conclusion, we compared the outcome of each algorithm and found that logistic regression is one of the best algorithms for review analysis of the Google Play Store from an accuracy perspective. Furthermore, we were able to demonstrate that logistic regression is better in terms of speed, accuracy, recall and F1 score. This conclusion was reached after preprocessing a number of data values from these data sets.
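The four measures named in the abstract can be illustrated with a minimal sketch for the three-class case (negative, neutral, positive). The function name `classification_metrics` and the choice of macro averaging (an unweighted mean over the three classes) are assumptions made for illustration; the abstract does not state which averaging the authors used.

```python
from collections import Counter

# Class labels taken from the abstract; the ordering is arbitrary.
LABELS = ("negative", "neutral", "positive")


def classification_metrics(y_true, y_pred, labels=LABELS):
    """Return accuracy and macro-averaged precision, recall and F1.

    Macro averaging is an assumption of this sketch, not necessarily
    the averaging used in the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }
```

Calling `classification_metrics(true_labels, predicted_labels)` on the output of any of the three classifiers gives one dictionary per model, which is the kind of per-algorithm comparison the abstract summarizes as a bar graph.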

Highlights

In an information era where large amounts of data must be processed every day, minute and second, and where there is a huge demand for computers with high processing speeds that deliver accurate results within nanoseconds, approximately 2.5 quintillion bytes of data are said to be generated daily, manually or automatically, using different tools and applications
Using term frequency (TF)/inverse document frequency (IDF) features, we showed that the logistic regression algorithm had a 0.621%
We evaluated the results using different machine-learning algorithms, such as naïve Bayes, random forest and logistic regression, that can check the semantics of users' reviews of some applications to determine whether the reviews were good, bad, average, etc

Summary

Introduction

In an information era where large amounts of data must be processed every day, minute and second, and where there is a huge demand for computers with high processing speeds that deliver accurate results within nanoseconds, approximately 2.5 quintillion bytes of data are said to be generated daily, manually or automatically, using different tools and applications. This illustrates the importance of text-mining techniques in handling and classifying data in a meaningful way. We applied various algorithms and text classification techniques to Android application reviews [2]
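To make the TF/IDF weighting that underlies this kind of text classification concrete, the following is a minimal sketch using only the Python standard library. It assumes whitespace tokenization, length-normalized term frequency and an unsmoothed IDF of ln(N/df); the names `tokenize` and `tf_idf` are hypothetical, and the paper's actual preprocessing and weighting variant may differ.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple assumption)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / doc length) * ln(N / document frequency)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many reviews does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy reviews invented for illustration; they are not from the paper's data set.
reviews = ["great app", "bad app", "great great update"]
for review, vector in zip(reviews, tf_idf(reviews)):
    print(review, vector)
```

Terms that occur in every review receive a weight of zero under this unsmoothed variant, while terms concentrated in a few reviews are weighted more heavily. The resulting vectors are the kind of input a classifier such as logistic regression, naïve Bayes or random forest would be trained on; library implementations often use smoothed IDF variants instead.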

Journal: Algorithms
Publication Date: Aug 18, 2020
Citations: 4
License type: CC BY 4.0
