Abstract

The vast majority of today’s mobile malware targets Android devices. An important task of malware analysis is the classification of malicious samples into known families. In this paper, we propose AndroDFA (DFA, detrended fluctuation analysis): an approach to Android malware family classification based on dynamic analysis of resource consumption metrics available from the proc file system. These metrics can be easily measured during sample execution. From each malware sample, we extract features through detrended fluctuation analysis (DFA) and Pearson’s correlation; a support vector machine is then employed to classify the samples into families. We provide an experimental evaluation based on malware samples from two datasets, namely Drebin and AMD. With the Drebin dataset, we obtained a classification accuracy of 82%, comparable with state-of-the-art approaches such as DroidScribe. However, compared to DroidScribe, our approach is easier to reproduce because it is based on publicly available tools only, does not require any modification to the emulated environment or Android OS, and, by design, can also be used on physical devices rather than exclusively on emulators. The latter is a key factor because modern mobile malware can detect the emulated environment and hide its malicious behavior. The experiments on the AMD dataset gave similar results, with an overall mean accuracy of 78%. Furthermore, we made the software we developed publicly available, to ease the reproducibility of our results.
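To make the feature-extraction step more concrete, the following is a minimal sketch of the textbook forms of DFA and Pearson's correlation applied to generic time series. It is not the authors' implementation: the function names, the window sizes, and the synthetic input series are assumptions made for illustration, and the parameters actually used by AndroDFA are not given in this summary.

```python
import math
import random


def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def dfa_exponent(series, window_sizes):
    """Estimate the DFA scaling exponent of a 1-D series.

    window_sizes is a hypothetical choice of scales, not taken from the paper.
    """
    mean = sum(series) / len(series)
    # Step 1: integrate the mean-centred series to obtain the profile.
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        if n < 4 or n_windows < 1:
            continue
        xs = list(range(n))
        squared_residuals = 0.0
        # Step 2: remove a local linear trend in each non-overlapping window.
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = linear_fit(xs, segment)
            squared_residuals += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment)
            )
        # Step 3: root-mean-square fluctuation at this scale.
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        if fluctuation > 0:
            log_n.append(math.log(n))
            log_f.append(math.log(fluctuation))

    if len(log_n) < 2:
        raise ValueError("need at least two usable window sizes")
    # Step 4: the exponent is the slope of log F(n) against log n.
    exponent, _ = linear_fit(log_n, log_f)
    return exponent


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic stand-ins for two resource-consumption series (illustration only).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2048)]
walk, position = [], 0.0
for step in noise:
    position += step
    walk.append(position)

scales = [8, 16, 32, 64, 128]
alpha_noise = dfa_exponent(noise, scales)
alpha_walk = dfa_exponent(walk, scales)
correlation = pearson(noise, walk)
```

Under these assumptions, uncorrelated noise should yield an exponent near 0.5 and a random walk an exponent near 1.5. A per-sample fingerprint could then combine one exponent per metric with the pairwise correlations between metrics before being passed to a classifier.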

Highlights

  • The relentless growth of smartphone sales and their pervasiveness in our daily lives have fostered the development of malicious software targeting mobile devices

  • We present AndroDFA (DFA, detrended fluctuation analysis), a methodology that relies on dynamic analysis, and propose an architecture for implementing it, which automatically executes Android apps and generates stimuli to simulate user inputs

  • It was clear that our approach was very robust for detecting malware belonging to the RunMMS or Koler families, while malware from the MSeg or the Mtk families was more difficult to detect with frequent confusion among them

Summary

Introduction

The relentless growth of smartphone sales and their pervasiveness in our daily lives have fostered the development of malicious software targeting mobile devices. Static approaches do not require the execution of samples under analysis and can potentially reveal all of a sample’s execution paths. However, they are not very effective against obfuscation techniques, as extensively demonstrated in [6], and cannot track malware self-modifications at runtime or the network traffic a sample generates [2]. Among dynamic approaches, DroidScribe relies on CopperDroid, an emulator for Android apps, which is not publicly available and can only be accessed through an online service. That service is not suitable for batch experiments, since it takes just one sample at a time as input, and the submission procedure cannot be automated because it requires an anti-bot challenge-response test. AndroDFA, by contrast, can run by design on both emulated and real smartphones because it relies only on the proc file system. This is a key factor, as modern mobile malware can detect the emulated environment and hide its malicious behavior.
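As a rough illustration of what relying only on the proc file system can look like, the sketch below parses snapshots in the `/proc/meminfo` text format into a time series for one field. The helper names and the sample snapshots are hypothetical, and this summary does not state which proc files or metrics AndroDFA actually reads. On a device or emulator, such text could be collected periodically, for example with `adb shell cat /proc/meminfo`.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict."""
    values = {}
    for line in text.splitlines():
        name, separator, rest = line.partition(":")
        if not separator:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def metric_series(snapshots, field):
    """Build a time series of one field from successive snapshots."""
    return [parse_meminfo(snapshot)[field] for snapshot in snapshots]


# Hypothetical snapshots taken at successive sampling instants.
snapshots = [
    "MemTotal:        2048000 kB\nMemFree:          512000 kB\n",
    "MemTotal:        2048000 kB\nMemFree:          498000 kB\n",
    "MemTotal:        2048000 kB\nMemFree:          471000 kB\n",
]
mem_free = metric_series(snapshots, "MemFree")
```

Because the parser works on plain text, the same code applies whether the snapshots come from an emulator or from a physical device.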

Background
Detrended Fluctuation Analysis
Pearson’s Correlation Coefficient
Mutual Information
Principal Component Analysis
Support Vector Machines
Related Work
Family Classification Methodology
Fingerprint Generation
Classification and Training
Architecture and Prototype Implementation
Architecture
Prototype Implementation
Performing Analysis on Real Devices
Datasets
Experimental Setup
Time Requirements
Stability of the DFA exponent
SVM Training and Test
Results
Comparison with DroidScribe
Experiment on the AMD Dataset
Conclusions
