Machine Learning Classification Algorithms for Adware in Android Devices: A Comparative Evaluation and Analysis

Joseph Yisa Ndagi, John K Alhassan

doi:10.1109/icecco48375.2019.9043288

Abstract

The exponential growth of Internet usage has opened the way for the exploitation of Internet users. A phishing attack is one means of obtaining a victim's confidential details over the Internet without the victim's awareness. High false-positive rates and low accuracy have been a setback in phishing detection. In this research, 17 supervised learning techniques were compared: RandomForest, Systematically Developed Forest (SysFor), Spectral Areas and Ratios Classifier (SPAARC), Reduced Error Pruning Tree (REPTree), RandomTree, Logistic Model Tree (LMT), Forest by Penalizing Attributes (ForestPA), JRip, PART, Nearest Neighbor with Generalization (NNge), One Rule (OneR), AdaBoostM1, RotationForest, LogitBoost, RseslibKnn, Library for Support Vector Machines (LibSVM), and BayesNet. The performance of the classifiers was rated in the WEKA data mining tool using Accuracy, Precision, Recall, F-Measure, Root Mean Squared Error, Receiver Operating Characteristic (ROC) Area, Root Relative Squared Error, False Positive Rate, and True Positive Rate. The research revealed that several other classifiers exist which, if properly explored, could yield more accurate results for phishing detection. RandomForest was found to be the best classifier, with an accuracy of 0.9838 and a false positive rate of 0.017. The comparative analysis shows that a low false-positive rate is achievable for phishing classification, which suggests that anti-phishing application developers can adopt the best-performing classification algorithm identified in this study to improve phishing attack detection and classification.
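As a reading aid for the count-based metrics named in the abstract, the following is a minimal sketch of how Accuracy, Precision, Recall (equal to the True Positive Rate), F-Measure, and False Positive Rate follow from a binary confusion matrix. The class name `ConfusionCounts`, the function `binary_metrics`, and the example counts are hypothetical illustrations; they are not taken from the paper or from WEKA, and WEKA's own summary may report per-class and weighted-average values instead of the single positive-class view shown here.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion matrix, with 'phishing' as the positive class."""
    tp: int  # phishing instances correctly flagged as phishing
    fp: int  # legitimate instances wrongly flagged as phishing
    tn: int  # legitimate instances correctly passed
    fn: int  # phishing instances missed


def binary_metrics(c: ConfusionCounts) -> dict:
    """Compute the standard count-based metrics.

    Assumes every denominator is non-zero (each class has at least
    one instance and at least one positive prediction was made).
    """
    total = c.tp + c.fp + c.tn + c.fn
    precision = c.tp / (c.tp + c.fp)
    recall = c.tp / (c.tp + c.fn)  # identical to the true positive rate
    return {
        "accuracy": (c.tp + c.tn) / total,
        "precision": precision,
        "recall": recall,
        "f_measure": 2 * precision * recall / (precision + recall),
        "false_positive_rate": c.fp / (c.fp + c.tn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
result = binary_metrics(example)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

A low false positive rate means few legitimate pages are flagged as phishing, which is why the abstract reports it alongside accuracy instead of relying on accuracy alone.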

Full Text