Android Malware Family Classification and Analysis: Current Status and Future Directions
Android receives major attention from security practitioners and researchers due to the influx number of malicious applications. For the past twelve years, Android malicious applications have been grouped into families. In the research community, detecting new malware families is a challenge. As we investigate, most of the literature reviews focus on surveying malware detection. Characterizing the malware families can improve the detection process and understand the malware patterns. For this reason, we conduct a comprehensive survey on the state-of-the-art Android malware familial detection, identification, and categorization techniques. We categorize the literature based on three dimensions: type of analysis, features, and methodologies and techniques. Furthermore, we report the datasets that are commonly used. Finally, we highlight the limitations that we identify in the literature, challenges, and future research directions regarding the Android malware family.
- Research Article
120
- 10.1109/tdsc.2017.2739145
- Oct 23, 2019
- IEEE Transactions on Dependable and Secure Computing
As the most widely used mobile platform, Android is also the biggest target for mobile malware. Given the increasing number of Android malware variants, detecting malware families is crucial so that security analysts can identify situations where signatures of a known malware family can be adapted as opposed to manually inspecting behavior of all samples. We present EC2 (Ensemble Clustering and Classification), a novel algorithm for discovering Android malware families of varying sizes-ranging from very large to very small families (even if previously unseen). We present a performance comparison of several traditional classification and clustering algorithms for Android malware family identification on DREBIN, the largest public Android malware dataset with labeled families. We use the output of both supervised classifiers and unsupervised clustering to design EC2. Experimental results on both the DREBIN and the more recent Koodous malware datasets show that EC2 accurately detects both small and large families, outperforming several comparative baselines. Furthermore, we show how to automatically characterize and explain unique behaviors of specific malware families, such as FakeInstaller, MobileTx, Geinimi. In short, EC2 presents an early warning system for emerging new malware families, as well as a robust predictor of the family (when it is not new) to which a new malware sample belongs, and the design of novel strategies for data-driven understanding of malware behaviors.
- Research Article
21
- 10.1109/tc.2022.3143439
- Nov 1, 2022
- IEEE Transactions on Computers
Android malware is an ongoing threat to billions of smart devices’ security, ranging from mobile phones to car infotainment systems. Despite numerous approaches and previous studies to develop solutions for detecting and preventing Android malware, the rapid continuous development of new malware variants requires a careful reconsideration and the development of effective methods to identify malware families given a meager number of malware instances. In this paper, we present DroidMalVet, a novel Android malware family classification and detection approach that does not require to perform complex program analyses or utilize large feature sets. DroidMalVet is the first to use a promising, diverse, and small set of software metrics as features in a supervised learning platform to classify and detect various Android malware families. Our extensive empirical evaluations on two large public malware datasets show that DroidMalVet accurately detects both small and large malware families with F-Score accuracy of 94.4% and 96%, and AUC equal to 99.5% and 99.7% on the malware families in Drebin and AMD datasets, respectively. Moreover, our results demonstrate the superior performance of DroidMalVet in detecting small families (i.e., families with few samples). DroidMalVet complements existing approaches and presents an early warning tool for detecting known and emerging malware families.
- Research Article
15
- 10.1007/s13198-017-0692-7
- Dec 12, 2017
- International Journal of System Assurance Engineering and Management
The growth of Android mobile platform has led to the increase in the number of malicious applications. Malware creators are ahead of the malware detectors. In this paper, we present eight techniques of hiding a malicious Android application inside images (PNG/JPEG) by using methods such as Concatenation, Obfuscation, Cryptography, and Steganography separately and in conjunction. The image containing the malicious application is stored in the resources of another Android application. After hiding the malicious application using these techniques, we evaluated the vulnerability of ten popular and freely downloadable commercial Android anti-malwares towards them. The results were alarming as only one of them was able to detect two hiding techniques in which the malicious Android application (or its obfuscated version) was hidden by concatenating it at the end of an image and rest all the other anti-malwares were blind towards the eight hiding techniques. If the malicious Android application (or its obfuscated version) is not hidden inside an image but resides as it is in the resources of another Android application, seven out of ten anti-malwares flagged such applications as malicious. Such an evaluation provides a metric for measurement of the available defense against evolving Android malware and also aids in improving the state of the art of the Android malware detection systems.
- Research Article
44
- 10.17485/ijst/2016/v9i21/90273
- Jun 20, 2016
- Indian Journal of Science and Technology
Background/Objectives: Now a days, Android Malware is coded so wisely that it has become very difficult to detect them. The static analysis of malicious code is not enough for detection of malware as this malware hides its method call in encrypted form or it can install the method at runtime. The system call tracing is an effective dynamic analysis technique for detecting malware as it can analyze the malware at the run time. Moreover, this technique does not require the application code for malware detection. Thus, this can detect that android malware also which are difficult to detect with static analysis of code. As Android was launched in 2008, so there were fewer studies available regarding the behavior of Android Malware Families and their characteristics. The aim of this work is to explore the behavior of 10 popular Android Malware Families focused on System Call Pattern of these families. Methods/Statistical Analysis: For this purpose, the authors have extracted the system call trace of 345 malicious applications from 10 Android Malware Families named FakeInstaller, Opfake, Plankton, DroidKungFu, BaseBridge, Iconosys, Kmin, Adrd and Gappusin using strace android tool and compared it with the system calls pattern of 300 Benign Applications to justify the behavior of malicious application. Findings: During the experiment, it is observed that the malicious applications invoke some system calls more frequently than benign applications. Different Android malware invokes the different set of system calls with different frequency. Applications/Improvements: This analysis can prove helpful in designing intrusion-detection systems for an android mobile device with more accuracy. Keywords: Android Kernal, Android Malware Installation Methods, Malware Families, System Call Analysis
- Conference Article
23
- 10.1109/dsc.2017.77
- Jun 1, 2017
The analysis of multiple Android malware families indicates malware instances within a common malware family always have similar call graph structures. Based on the isomorphism of sensitive API call graph, we propose a method which is used to construct malware family features via combining static analysis approach with graph similarity metric. The experiment is performed on a malware dataset which contains 1326 malware samples from 16 different malware families. The result shows that the method can differentiate distinct malware family features and divide suspect malware samples into corresponding families with a high accuracy of 96.77% overall and even defend a certain extent of obfuscation.
- Book Chapter
25
- 10.1007/978-3-319-24018-3_12
- Jan 1, 2015
First we report on a new threat campaign, underway in Korea, which infected around 20,000 Android users within two months. The campaign attacked mobile users with malicious applications spread via different channels, such as email attachments or SMS spam. A detailed investigation of the Android malware resulted in the identification of a new Android malware family Android/BadAccents. The family represents current state-of-the-art in mobile malware development for banking trojans. Second, we describe in detail the techniques this malware family uses and confront them with current state-of-the-art static and dynamic code-analysis techniques for Android applications. We highlight various challenges for automatic malware analysis frameworks that significantly hinder the fully automatic detection of malicious components in current Android malware. Furthermore, the malware exploits a previously unknown tapjacking vulnerability in the Android operating system, which we describe. As a result of this work, the vulnerability, affecting all Android versions, will be patched in one of the next releases of the Android Open Source Project.
- Conference Article
23
- 10.1109/bigdata47090.2019.9005669
- Dec 1, 2019
Android malware (malicious apps) families share common attributes and behavior through sharing core malicious code. However, as the number of new malware increases, the task of identifying the correct family becomes more challenging. Two prominent approaches tackle this problem, either using dynamic analysis that captures the runtime behavior of the malware or using static analysis methods that can reveal malicious behavior by analyzing the underlying logic and code patterns. A third emerging way is to use the various sources of identification features to analyze the architectural and external attributes of a malicious app. For example, two malicious apps can have different behavioral patterns but share common attributes. We hypothesize that this malware can belong to the same family but attempt to mislead dynamic and code-level static analysis tools by randomizing their behavior. In this work, we utilize a promising set of Android-oriented code metrics that guide a supervised classification learning process for identifying malware families in Android. Our empirical results on 2,869 malware apps, across 35 different malware families, show that these metrics are very effective to identify malware families. In particular, we achieve low false positive rate (1.2%) and AUC score of 0.984 for family identification by using Random Forest (RF) classifier.
- Research Article
246
- 10.1016/j.eswa.2013.07.106
- Aug 13, 2013
- Expert Systems with Applications
Dendroid: A text mining approach to analyzing and classifying code structures in Android malware families
- Conference Article
- 10.1109/prdc.2018.00047
- Dec 1, 2018
Malware developers often use various obfuscation techniques to generate polymorphic and metamorphic versions of malicious programs. As a result, variants of a malware family generally exhibit resembling behavior, and most importantly, they possess certain common essential codes so to achieve the same designed purpose. Meantime, keeping up with new variants and generating signatures for each individual in a timely fashion has been costly and inefficient for anti-virus software companies. It motivates us the idea of no more dancing with variants. In this paper, we aim to find a malware family's main characteristic operations or activities directly related to its intent. We propose a novel automatic dynamic Android profiling system and malware family runtime behavior signature generation method called Runtime API sequence Motif Mining Algorithm (RasMMA) based on the analysis of the sensitive and permission-related execution traces of the threads and processes of a set of variant APKs of a malware family. We show the effectiveness of using the generated family signature to detect new variants using real-world dataset. Moreover, current anti-malware tools usually treat detection models as a black box for classification and offer little explanations on how malwares behave and how they proceed step by step to infiltrate targeted system and achieve the goal. We take malware family DroidKungFu as a case study to illustrate that the generated family signature indeed captures key malicious activities of the family.
- Research Article
4
- 10.1155/2022/3111540
- Jan 29, 2022
- Security and Communication Networks
Recently, security policies and behaviour detection methods have been proposed to improve the security of blockchain by many researchers. However, these methods cannot discover the source of typical behaviours, such as the malicious applications in the blockchain environment. Android application is an important part of the blockchain operating environment, and machine learning-based Android malware application detection method is significant for blockchain user security. The way of constructing features in these methods determines the performance. The single-feature mechanism, training classifiers with one type of features, cannot detect the malicious applications effectively which exhibit the typical behaviours in various forms. The multifeatures fusion mechanism, constructing mixed features from multiple types of data sources, can cover more kinds of information. However, different types of data sources will interfere with each other in the mixed features constructed by this mechanism. That limits the performance of the model. In order to improve the detection performance of Android malicious applications in complex scenarios, we propose an Android malicious application detection method which includes parallel feature processing and decision mechanism. Our method uses RGB image visualization technology to construct three types of RGB image which are utilized to train different classifiers, respectively, and a decision mechanism is designed to fuse the outputs of subclassifiers through weight analysis. This approach simultaneously extracts different types of features, which preserve application information comprehensively. Different classifiers are trained by these features to guarantee independence of each feature and classifier. On this basis, a comprehensive analysis of many methods is performed on the Android malware dataset, and the results show that our method has better efficiency and adaptability than others.
- Conference Article
17
- 10.1109/fuzz-ieee.2017.8015490
- Jul 1, 2017
Mobile systems have become essential for communication and productivity but are also becoming target of continuous malware attacks. New malware are often obtained as variants of existing malicious code. This work describes an approach for dynamic malware detection based on the combination of Process Mining (PM) and Fuzzy Logic (FL) techniques. The firsts are used to characterize the behavior of an application identifying some recurring execution expressed as a set of declarative constraints between the system calls. Fuzzy logic is used to classify the analyzed malware applications and verify their relations with the existing malware variants. The combination of the two techniques allows to obtain a fingerprint of an application that is used to verify its maliciousness/trustfulness, establish if it belongs from a known malware family and identify the differences between the detected malware behavior and the other variants of the some malware family. The approach is applied on a dataset of 3000 trusted and malicious applications across twelve malware families and has shown a very good discrimination ability that can be exploited for malware detection and family identification.
- Research Article
30
- 10.1016/j.diin.2013.06.011
- Aug 1, 2013
- Digital Investigation
Automated identification of installed malicious Android applications
- Conference Article
65
- 10.1109/msn.2012.43
- Dec 1, 2012
In this paper, we present a smart phone dual defense protection framework that allows Official and Alternative Android Markets to detect malicious applications among those new applications that are submitted for public release. Our framework consists of servers running on clouds where developers who wish to release their new applications can upload their software for verification purpose. The verification server first uses system call statistics to identify potential malicious applications. After verification, if the software is clean, the application will then be released to the relevant markets. To mitigate against false negative cases, users who run new applications can invoke our network traffic monitoring (NTM)tool which triggers network traffic capture upon detecting some suspicious behaviors e.g. detecting sensitive data being sent to output stream of an open socket. The network traffic will be analyzed to see if it matches network characteristics observed from malware applications. If suspicious network traffic is observed, the relevant Android markets will be notified tore move the application from the repository. We trained our system call and network traffic classifiers using 32 families of known Android malware families and some typical normal applications. Later, we evaluated our framework using other malware and normal applications that used in the training set. Our experimental results using 120 test applications (which consist of 50 malware and 70 normal applications) indicate that we can achieve a 94.2% and 99.2% accuracy with J.48 and Random forest classifier respectively using our framework.
- Research Article
53
- 10.1155/2020/5190138
- Jan 25, 2020
- Scientific Programming
Nowadays, Android applications declare as many permissions as possible to provide more function for the users, which also poses severe security threat to them. Although many Android malware detection methods based on permissions have been developed, they are ineffective when malicious applications declare few dangerous permissions or when the dangerous permissions declared by malicious applications are similar with those declared by benign applications. This limitation is attributed to the use of too few information for classification. We propose a new method named fine-grained dangerous permission (FDP) method for detecting Android malicious applications, which gathers features that better represent the difference between malicious applications and benign applications. Among these features, the fine-grained feature of dangerous permissions applied in components is proposed for the first time. We evaluate 1700 benign applications and 1600 malicious applications and demonstrate that FDP achieves a TP rate of 94.5%. Furthermore, compared with other related detection approaches, FDP can detect more malware families and only requires 15.205 s to analyze one application on average, which demonstrates its applicability for practical implementation.
- Research Article
12
- 10.1016/s1361-3723(14)70524-x
- Aug 1, 2014
- Computer Fraud & Security
The evolution of mobile malware