A Closer Look at Machine Learning Effectiveness in Android Malware Detection

Filippos Giannakas, Vasileios Kouliaridis, Georgios Kambourakis

doi:10.3390/info14010002

Open Access

https://doi.org/10.3390/info14010002

Journal: Information
Publication Date: Dec 21, 2022
Citations: 5
License type: CC BY 4.0

Affiliation: University of the Aegean

Abstract

With the increasing use of Android devices in daily life, malware has been spreading rapidly, putting people's security and privacy at risk. To mitigate this threat, researchers have proposed a range of methods for detecting Android malware. Recently, machine-learning-based models have been explored by a considerable number of researchers for this purpose. However, selecting the most appropriate model is not straightforward, since several aspects must be considered. Contributing to this domain, the current paper explores Android malware detection from diverse perspectives by optimizing and evaluating various machine learning algorithms. Specifically, we conducted an experiment that trained, optimized, and evaluated 27 machine learning algorithms and a Deep Neural Network (DNN). During the optimization phase, we performed hyperparameter analysis using the Optuna framework. In the evaluation phase, we measured several performance metrics against a contemporary, rich dataset to identify the most accurate model. The best model was further interpreted through feature analysis using the Shapley Additive Explanations (SHAP) framework. Our experimental results showed that the best model is a DNN consisting of four layers (two hidden), using the Adamax optimizer, the Binary Cross-Entropy loss function, and the Softsign activation function. The model achieved 86% prediction accuracy, while the balanced accuracy, the F1-score, and the ROC-AUC were each at 82%.
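
The abstract reports plain accuracy alongside balanced accuracy and F1-score, and the two can diverge when benign and malicious samples are not equally represented. The following is a minimal sketch of how these metrics are computed for a binary classifier; the function name `binary_metrics` and the ten toy labels are hypothetical illustrations and are not taken from the paper's dataset or results.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, balanced accuracy, and F1 for 0/1 labels (1 = malware)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malware caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malware missed

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    # Balanced accuracy averages the per-class recall, so the majority
    # class cannot dominate the score the way it can with plain accuracy.
    balanced_accuracy = (recall + specificity) / 2
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy,
            "balanced_accuracy": balanced_accuracy,
            "f1": f1}


# Hypothetical toy data: 3 malware samples and 7 benign samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On this imbalanced toy set the plain accuracy comes out higher than the balanced accuracy, which is the same direction as the gap between the 86% and 82% figures quoted above, although the sketch says nothing about the class distribution actually used in the study.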

Full Text