Android Malware Familial Classification Based on DEX File Section Features

Yong Fang, Lei Zhang, Yangchen Gao, Fan Jing

doi:10.1109/access.2020.2965646

Open Access

https://doi.org/10.1109/access.2020.2965646

Abstract

The rapid proliferation of Android malware makes it increasingly difficult to classify samples into malware families. Traditional static classification methods are easily defeated by obfuscation and reinforcement (app hardening), while dynamic methods are computationally expensive. To address these problems, this paper proposes an Android malware familial classification method based on Dalvik Executable (DEX) file section features. First, the DEX file is converted into an RGB (red/green/blue) image and into plain text; features are then extracted from the color and texture of the image and from the text. Finally, a feature fusion algorithm based on multiple kernel learning is used for classification. The Android Malware Dataset (AMD) was used as the sample set. Two comparative experiments were set up, comparing the proposed method against a common visualization method and a feature fusion method. The results show that the proposed method classifies more accurately, with precision, recall, and F1 score all reaching 0.96. In addition, feature extraction takes 2.999 seconds less than the frequent-subsequence method. In conclusion, the proposed method is efficient and precise for Android malware family classification.
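The first step of the pipeline, turning DEX bytes into an RGB image and deriving a color feature from it, can be illustrated with a minimal sketch. The abstract does not specify the byte-to-pixel mapping, so the scheme here (three consecutive bytes per pixel, a fixed row width, zero padding) is an assumption, as are the function names `bytes_to_rgb_rows` and `channel_means`. The per-channel mean stands in for a color feature only as an example; it is not claimed to be the feature set used in the paper.

```python
def bytes_to_rgb_rows(data: bytes, width: int = 256):
    """Map raw bytes to rows of (R, G, B) pixels.

    Assumed mapping (not taken from the paper): each run of three
    consecutive bytes becomes one pixel, and pixels fill rows of a
    fixed width. Missing bytes and pixels are padded with zeros.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    if not data:
        raise ValueError("data must not be empty")
    # Pad so the byte count is a multiple of 3 (one byte per channel).
    padded = data + b"\x00" * (-len(data) % 3)
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    # Pad the last row with black pixels so every row has `width` pixels.
    pixels += [(0, 0, 0)] * (-len(pixels) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def channel_means(rows):
    """Return the mean intensity of the R, G, and B channels.

    This is the first-order color moment, used here as a simple
    illustrative color feature.
    """
    flat = [pixel for row in rows for pixel in row]
    count = len(flat)
    return tuple(sum(pixel[c] for pixel in flat) / count for c in range(3))


if __name__ == "__main__":
    # The first 8 bytes of a DEX file are its magic number, e.g. "dex\n035\0".
    sample = b"dex\n035\x00"
    image = bytes_to_rgb_rows(sample, width=2)
    print(image)
    print(channel_means(image))
```

In practice the input would be the bytes of a DEX file (or of one of its sections) read with `open(path, "rb").read()`, and the resulting rows could be handed to an image library for texture analysis.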

Highlights

According to the Android malware 2018 special report [1], 360 Internet Security Center intercepted 4.342 million new samples of Android malware in 2018, averaging about 12,000 new samples per day.
Grayscale images offer too limited a set of characteristics. To overcome these shortcomings of existing methods, this paper proposes a familial classification method for Android malware based on Dalvik Executable (DEX) file section features.
Traditional static analysis is undermined by app reinforcement (hardening) and obfuscation, dynamic analysis requires too much time and space, and the features extracted by existing visualization methods are too simple.

Summary

Introduction

According to the Android malware 2018 special report [1], 360 Internet Security Center intercepted 4.342 million new samples of Android malware in 2018, averaging about 12,000 new samples per day. Android malware is growing rapidly, nearly three times as fast as two years ago. A single Android malware family can contain many variants, which seriously threatens the security of mobile smart devices. Detecting Android malware and classifying it by family is therefore important. Traditional Android malware classification methods mainly comprise static analysis, dynamic analysis, and combinations of the two. Traditional static analysis methods classify malware mainly on the basis of static features of Android application package (APK) files. Zhou et al. [3] proposed "DroidMoss". This method decomposes the DEX file of an Android application into Dalvik bytecode and calculates the fuzzy hash of the bytecode to determine whether the Android application is

Methods

Findings

Conclusion