Abstract

Smartphones and tablets play a significant role in daily life, and their user base continues to grow. This growth has prompted attackers to produce more malware, leaving mobile devices increasingly vulnerable. Machine learning plays an important role in detecting malicious mobile applications. In this study, we focus on static analysis for Android malware detection. The ultimate goal of this research is to identify the symmetric features shared across malicious Android applications so that they can be detected easily. Many state-of-the-art methods focus on extracting asymmetric patterns from one category of features, e.g., application permissions, to distinguish malicious applications from benign ones. In this work, we propose a compromise: we consider different types of static features and select the most important features that affect the detection process. These features represent the symmetric pattern used for the classification task. Inspired by TF-IDF, we propose a novel feature selection method. Moreover, we propose a new method for merging the URLs of an Android application into a single feature called the URL_score. Several linear machine learning classifiers are used to evaluate the proposed method. The proposed methods significantly reduce the feature space, i.e., the symmetric pattern, of the Android application dataset and the memory size of the final model. In addition, the proposed model achieves the highest reported accuracy for the Drebin dataset to date. Based on the evaluation results, the linear support vector machine achieves an accuracy of 99%.

Highlights

  • Smartphones play a vital role in our daily lives and are widely used for many purposes, e.g., web browsing, online banking, online learning, and social networking.

  • We present a comprehensive static analysis to evaluate the effectiveness of machine learning (ML)

  • Inspired by the term frequency (TF) and inverse document frequency (IDF) techniques [39], we propose a frequency-based feature selection method called the feature frequency-application frequency (FF-AF) method
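
The excerpt above does not give the FF-AF formula, so the following is only a minimal sketch of what a TF-IDF-style, frequency-based feature selection could look like. The function names, the scoring formula (share of malware apps containing a feature, multiplied by the log of the inverse share of all apps containing it), and the toy feature sets are all assumptions made for illustration; they are not the authors' method.

```python
import math
from collections import Counter

def frequency_scores(malware_apps, all_apps):
    """TF-IDF-style feature weights (illustrative assumption, not the paper's FF-AF formula).

    malware_apps and all_apps are lists of feature sets, one set per application.
    """
    n_apps = len(all_apps)
    # Number of malware apps that contain each feature.
    malware_counts = Counter(f for app in malware_apps for f in set(app))
    # Number of apps of any class that contain each feature.
    app_counts = Counter(f for app in all_apps for f in set(app))
    scores = {}
    for feature, count in malware_counts.items():
        feature_freq = count / len(malware_apps)             # TF-like term
        inverse_app_freq = math.log(n_apps / app_counts[feature])  # IDF-like term
        scores[feature] = feature_freq * inverse_app_freq
    return scores

def select_top_features(scores, k):
    """Return the k highest-scoring features (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [feature for feature, _ in ranked[:k]]

# Hypothetical toy data: each set holds the static features of one app.
malware = [
    {"SEND_SMS", "INTERNET"},
    {"SEND_SMS", "INTERNET", "READ_CONTACTS"},
    {"SEND_SMS"},
]
benign = [{"INTERNET"}, {"INTERNET", "CAMERA"}, {"INTERNET"}]

scores = frequency_scores(malware, malware + benign)
top_features = select_top_features(scores, 2)
print(top_features)
```

In this toy data, a feature such as INTERNET that appears in almost every app receives a low weight, while features concentrated in the malware set rank higher. This is the same intuition that TF-IDF applies to common versus distinctive terms.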


Summary

Introduction

Smartphones play a vital role in our daily lives and are widely used for many purposes, e.g., web browsing, online banking, online learning, and social networking. Due to the rapid growth in the use of Android smartphones and the openness of the Android platform, Android smartphones are increasingly targeted by attackers and infected with malicious software [2,3]. We discuss how URLs are treated as features in existing malware detection methods. There are three main types of Android malware detection, namely, static, dynamic, and hybrid analysis. The static analysis task includes extracting static features from the source code and manifest file of the application. The features that discriminate malicious Android applications do not change. For example, an Android application that can send expensive messages without the user's interaction is suspected of being malicious [10].
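
As a rough illustration of the static feature extraction described above, the sketch below reads the requested permissions from a manifest file. It assumes the manifest has already been decoded to plain-text XML (inside an APK it is stored in a compiled binary form), and the sample manifest and the helper name extract_permissions are hypothetical.

```python
import xml.etree.ElementTree as ET

# XML namespace used for android:* attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"

def extract_permissions(manifest_xml):
    """Return the sorted, de-duplicated permissions requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    name_attr = "{%s}name" % ANDROID_NS
    permissions = {
        element.get(name_attr)
        for element in root.iter("uses-permission")
        if element.get(name_attr)
    }
    return sorted(permissions)

# Hypothetical decoded manifest for an app that can send SMS messages.
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.app">
    <uses-permission android:name="android.permission.SEND_SMS" />
    <uses-permission android:name="android.permission.INTERNET" />
    <application android:label="Example" />
</manifest>
"""

permissions = extract_permissions(SAMPLE_MANIFEST)
print(permissions)
```

A permission such as android.permission.SEND_SMS is the kind of static feature that would flag the message-sending behavior mentioned above, without ever running the application.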
