Articles published on Android Malware Family
38 Search results
- Research Article
6
- 10.1007/s42979-024-03647-x
- Feb 21, 2025
- SN Computer Science
- Kshamta Chauhan + 1 more
Hybrid Sampling Technique for Imbalanced Android Malware Family Classification
- Research Article
7
- 10.1145/3708500
- Jan 22, 2025
- ACM Computing Surveys
- Tejpal Sharma + 1 more
Nowadays, smartphones have made our lives easier and have become essential gadgets for us. Apart from calling, mobiles are used for various purposes, such as banking, chatting, data storage, connecting to the internet, and running apps, that make life easier. Therefore, attackers are developing new methods or malware to steal smartphone data. Primarily, the study outlines various types of Android malware families, the evolution of Android malware and its effects on detection techniques over time. We report malware timelines and Android app datasets with their source web links. Data are collected from various recent studies and reported. In this study, we have reported 384 Android malware families and their year of discovery, i.e., from 2001 to 2020. According to the malfunctions they perform on the device, we categorized the families into 11 types. Information about datasets is divided into three categories, along with their source links, and is presented. The categorization and timeline of malware will make it easy for researchers to focus on upcoming trends according to the malware category and activities they perform. Various open issues and future challenges are also addressed for future researchers.
- Research Article
- 10.1049/ise2/8843518
- Jan 1, 2025
- IET Information Security
- K Sundara Krishnan + 1 more
The rapid growth and diversification of malware variants, driven by advanced code obfuscation, evasion, and anti-analysis techniques, present a significant threat to cybersecurity. The inadequacy of traditional methods in accurately classifying these evolving threats highlights the need for effective and robust malware classification techniques. This article presents WinDroid, a novel visualization-based framework for Windows and Android malware family (AMF) classification using hybrid features and hierarchical ensemble learning. The WinDroid system employs a multistage approach to malware classification, transforming binaries into Markov grayscale images, enhanced via contrast-limited adaptive histogram equalization and gamma correction. Deep learning and handcrafted features are extracted and fused using graph attention networks (GATs), feeding into hierarchical support vector machines (SVMs) for accurate family classification. This framework effectively reduces information loss, enhances computational efficiency, and demonstrates outstanding performance. WinDroid achieves 99.53% accuracy on Windows and 99.65% on AMF classification, along with Cohen’s kappa coefficients of 99.01% and 99.28%, respectively, outperforming state-of-the-art baseline methods.
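The abstract above does not spell out how its Markov grayscale images are built. As a rough illustration only, one common construction treats the binary as a byte sequence, counts transitions between consecutive byte values, normalizes each row to transition probabilities, and scales them to 0-255 gray levels. The function name `markov_image` and the scaling choice are assumptions for this sketch, not WinDroid's actual procedure.

```python
def markov_image(data: bytes):
    """Build a 256x256 gray-level matrix from byte-to-byte transition probabilities.

    Assumption: pixel (a, b) = 255 * P(next byte = b | current byte = a).
    """
    counts = [[0] * 256 for _ in range(256)]
    # Count every transition between consecutive bytes.
    for current, following in zip(data, data[1:]):
        counts[current][following] += 1
    image = []
    for row in counts:
        total = sum(row)
        # Rows for byte values that never occur stay black.
        image.append([round(255 * c / total) if total else 0 for c in row])
    return image

# Hypothetical toy input standing in for the bytes of a binary.
toy = b"\x00\x01\x00\x01\x00\x02"
img = markov_image(toy)
```

The output size is fixed at 256x256 regardless of file size, which is one reason this kind of representation is convenient as input to an image classifier.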
- Research Article
- 10.47065/josh.v6i1.6053
- Oct 21, 2024
- Journal of Information System Research (JOSH)
- Rajif Agung Yunmar
Malware poses a significant threat to cybersecurity, particularly for Android users. Malware is grouped into distinct categories and families, each exhibiting unique malicious capabilities. Accurately identifying these categories and families is crucial for developing effective prevention and mitigation strategies, allowing threats to be controlled before they worsen. Over the years, numerous techniques have been proposed for detecting malware families, with system calls emerging as a vital feature. Collected through dynamic analysis, system calls offer in-depth insight into the activities executed by malware, making them a powerful classification tool. This study aims to enhance the detection of Android malware families and categories by analyzing system calls with a feature selection method. Using the Gain Ratio algorithm, significant system calls are identified to improve detection accuracy and reduce the complexity of the feature set. The study assesses four machine learning algorithms: Random Forest, J48, Naïve Bayes, and Decision Table. The findings show that Random Forest consistently outperforms the other algorithms, achieving an accuracy of 88.01% for malware family detection and 89.65% for category detection, with high precision and recall across most metrics. Applying the Gain Ratio feature selection method reduced the feature set by 68.83% and improved model-building speed by 50.26%. This integration of feature selection and machine learning provides a more effective approach to detecting malware families and categories, thus contributing to enhanced Android security.
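For readers unfamiliar with Gain Ratio, the following is a minimal sketch of the standard definition (information gain divided by split information) applied to discrete features. The toy data, the system-call names, and the binary "observed or not" encoding are assumptions for illustration; the study's actual feature encoding is not described in the abstract.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain_ratio(feature_values, labels):
    """Gain ratio = information gain / split information for one discrete feature."""
    total = len(labels)
    # Group the class labels by the value the feature takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy of the labels given the feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: 1 = the system call appeared in the trace, 0 = it did not.
families = ["A", "A", "B", "B"]
calls = {
    "sendto": [1, 1, 0, 0],  # perfectly separates the two families
    "getpid": [1, 0, 1, 0],  # carries no information about the family
}
ranking = sorted(calls, key=lambda name: gain_ratio(calls[name], families), reverse=True)
```

Features would then be kept or dropped according to their rank or a score threshold; where to place that cut is a tuning decision the sketch does not address.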
- Research Article
- 10.3233/jifs-219367
- Apr 2, 2024
- Journal of Intelligent & Fuzzy Systems
- Horacio Rodriguez-Bazan + 2 more
Recently, Android device usage has increased significantly, and so has the number of malicious applications for the Android ecosystem. Security researchers have studied Android malware analysis as an emerging issue. Existing methods employ static, dynamic, or hybrid analysis along with Machine Learning (ML) algorithms to detect and classify malware into families. These families often exhibit similarities among their members or with other families. This paper presents a new method that combines Fuzzy Hashing and Natural Language Processing (NLP) techniques to find Android malware families based on their similarities, applying reverse engineering to extract features and computing fuzzy hashes of the preprocessed code. This relationship allows us to identify the families according to their features. A study was conducted using a test database of 2,288 samples from diverse ransomware families. The method achieved an accuracy of up to 98.46% in classifying Android ransomware.
- Research Article
10
- 10.1016/j.knosys.2024.111531
- Feb 16, 2024
- Knowledge-Based Systems
- Zhendong Wang + 3 more
FAGnet: Family-aware-based android malware analysis using graph neural network
- Research Article
21
- 10.1109/tc.2022.3143439
- Nov 1, 2022
- IEEE Transactions on Computers
- Karim O Elish + 2 more
Android malware is an ongoing threat to the security of billions of smart devices, ranging from mobile phones to car infotainment systems. Despite numerous approaches and previous studies aimed at detecting and preventing Android malware, the rapid, continuous development of new malware variants calls for careful reconsideration and for effective methods that identify malware families given only a meager number of malware instances. In this paper, we present DroidMalVet, a novel Android malware family classification and detection approach that requires neither complex program analyses nor large feature sets. DroidMalVet is the first to use a promising, diverse, and small set of software metrics as features in a supervised learning platform to classify and detect various Android malware families. Our extensive empirical evaluations on two large public malware datasets show that DroidMalVet accurately detects both small and large malware families, with F-scores of 94.4% and 96% and AUC of 99.5% and 99.7% on the malware families in the Drebin and AMD datasets, respectively. Moreover, our results demonstrate the superior performance of DroidMalVet in detecting small families (i.e., families with few samples). DroidMalVet complements existing approaches and presents an early warning tool for detecting known and emerging malware families.
- Research Article
5
- 10.1142/s0218126622502978
- Jul 15, 2022
- Journal of Circuits, Systems and Computers
- Zhiqiang Wang + 3 more
Because the Android operating system is open source and versatile, Android malware has proliferated, and malware detection for Android IoT devices has become a research hotspot in recent years. Static analysis cannot effectively analyze obfuscated malware. Existing detection methods that avoid decompilation are mainly based on grayscale images and single files, and their anti-obfuscation performance has not been analyzed or verified. In addition, current deep-learning-based detection of Android malware concentrates on binary classification. To address these problems, this paper proposes a multi-class classification method for Android malware families based on multiple feature files and RGB images. The method does not need to decompile the Android APK installation package; instead, it extracts the DEX and XML files in batch from the APK. It then converts these files into an RGB image using an algorithm that converts Android software into images. Finally, a deep neural network automatically learns the texture features of the RGB image to classify Android malware families. Experimental data show that the proposed method has high detection performance, with multi-class family classification accuracy as high as 99.84%. In addition, the RGB-image-based method is more accurate than the grayscale-image method, and an RGB image combining DEX and XML performs better than an image of the DEX file or the XML file alone. The method can therefore effectively detect obfuscated Android malware, reaching a detection accuracy of 99.23% on obfuscated samples, which indicates good anti-obfuscation ability. The proposed method is compared with methods based on Multi-Layer Perceptron, Long Short-Term Memory, bidirectional Long Short-Term Memory, and Deep Belief Network. The experimental results show the proposed method’s effectiveness and high generalization performance.
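The abstract does not give the conversion algorithm, so the following is only a sketch of one plausible byte-to-RGB mapping: every three consecutive bytes become one (R, G, B) pixel, and the tail is zero-padded to fill the last row. The function name `bytes_to_rgb_rows`, the fixed `width` parameter, and the sample bytes are all hypothetical.

```python
def bytes_to_rgb_rows(data: bytes, width: int = 4):
    """Pack raw bytes into rows of (R, G, B) pixels, zero-padding the tail."""
    # Pad so the byte count is a multiple of 3 (one pixel = three bytes).
    padded = data + b"\x00" * (-len(data) % 3)
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    # Pad with black pixels so the last row is complete.
    pixels += [(0, 0, 0)] * (-len(pixels) % width)
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

# Hypothetical stand-in for the concatenated DEX and XML bytes of one APK.
sample = b"dex\n035\x00<manifest>"
rows = bytes_to_rgb_rows(sample, width=4)
```

In practice the rows would be handed to an imaging library and resized to the classifier's input shape; that step is left out here to keep the sketch dependency-free.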
- Research Article
28
- 10.1109/access.2021.3139334
- Jan 1, 2022
- IEEE Access
- Hyun-Il Kim + 3 more
It is important to effectively detect, mitigate, and defend against Android malware attacks, because Android malware has long represented a major threat to Android app security. Characterizing and classifying similar malicious apps into groups plays a particularly crucial role in building a secure Android app ecosystem. The classification of malware families can efficiently enhance the malware detection process and systematically elucidate malware patterns. In this paper, we propose a novel efficient deep learning network with multi-streams for Android malware family classification. We first obtain the input data for a convolutional neural network (CNN) in string format from some main files or sections contained in each Android malicious app. We then classify malware families by applying a 1-dimensional convolution filter-based network for the files or sections. Further, by using gradient analysis to visualize the important files and sections in malicious apps, we attempt to intuitively grasp which files or sections are the most significant for malware family classification. To validate the effectiveness of our approach, we conduct extensive experiments with the well-known DREBIN and AMD malware datasets, and we compare our approach with existing methods. Our experimental results show that the 1D CNN model is more accurate than the 2D CNN model, and that the code_item part in the classes.dex is the most relevant feature for malware classification, as it is more relevant than other parts such as AndroidManifest.xml and certificate. The proposed method achieves the best accuracy of 93.2% by using 1D convolution filters with multi-streams for the main files and sections of the malware samples.
- Research Article
17
- 10.3390/app112110244
- Nov 1, 2021
- Applied Sciences
- Minki Kim + 5 more
Malware family classification is grouping malware samples that have the same or similar characteristics into the same family. It plays a crucial role in understanding notable malicious patterns and recovering from malware infections. Although many machine learning approaches have been devised for this problem, there are still several open questions including, “Which features, classifiers, and evaluation metrics are better for malware familial classification”? In this paper, we propose a machine learning approach to Android malware family classification using built-in and custom permissions. Each Android app must declare proper permissions to access restricted resources or to perform restricted actions. Permission declaration is an efficient and obfuscation-resilient feature for malware analysis. We developed a malware family classification technique using permissions and conducted extensive experiments with several classifiers on a well-known dataset, DREBIN. We then evaluated the classifiers in terms of four metrics: macrolevel F1-score, accuracy, balanced accuracy (BAC), and the Matthews correlation coefficient (MCC). BAC and the MCC are known to be appropriate for evaluating imbalanced data classification. Our experimental results showed that: (i) custom permissions had a positive impact on classification performance; (ii) even when the same classifier and the same feature information were used, there was a difference up to 3.67% between accuracy and BAC; (iii) LightGBM and AdaBoost performed better than other classifiers we considered.
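To make the gap between accuracy and balanced accuracy concrete, here is a small sketch using the standard definitions: balanced accuracy as the mean of per-class recall, and the multiclass Matthews correlation coefficient computed from per-class counts. The labels and predictions are invented toy data, not results from the paper.

```python
import math
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every family counts equally."""
    support = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(hits[c] / support[c] for c in support) / len(support)

def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from per-class counts."""
    s = len(y_true)
    c = sum(t == p for t, p in zip(y_true, y_pred))
    t_k, p_k = Counter(y_true), Counter(y_pred)
    classes = set(t_k) | set(p_k)
    numerator = c * s - sum(p_k[k] * t_k[k] for k in classes)
    denominator = math.sqrt(
        (s * s - sum(v * v for v in p_k.values()))
        * (s * s - sum(v * v for v in t_k.values()))
    )
    # Conventionally 0 when one side has only a single class.
    return numerator / denominator if denominator else 0.0

# Hypothetical imbalanced toy set: eight samples of family A, two of family B.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 8 + ["A", "B"]  # one of the two B samples is misclassified

acc = accuracy(y_true, y_pred)
bac = balanced_accuracy(y_true, y_pred)
corr = mcc(y_true, y_pred)
```

On this toy set accuracy is 0.9 while balanced accuracy is 0.75, because the single error falls on the minority family and halves its recall.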
- Research Article
7
- 10.3390/s21165671
- Aug 23, 2021
- Sensors (Basel, Switzerland)
- Mohammed Rashed + 1 more
The growing volume of Android malware has forced antivirus (AV) companies to rely on automated classification techniques to determine the family and class of suspicious samples. The research community relies heavily on such labels to carry out prevalence studies of the threat ecosystem and to build datasets that are used to validate and benchmark novel detection and classification methods. In this work, we carry out an extensive study of the Android malware ecosystem by surveying white papers and reports from 6 key players in the industry, as well as 81 papers from 8 top security conferences, to understand how malware datasets are used by both. We then explore the limitations associated with the use of available malware classification services, namely VirusTotal (VT) engines, for determining the family of an Android sample. Using a dataset of 2.47 M Android malware samples, we find that the detection coverage of VT’s AVs is generally very low, that the percentage of samples flagged by any 2 AV engines does not go beyond 52%, and that the share of common families between any pair of AV engines is at best 29%. We rely on clustering to determine the extent to which different AV engine pairs agree upon which samples belong to the same family (regardless of the actual family name) and find that there are discrepancies that can introduce noise in automatic label unification schemes. We also observe the usage of generic labels and inconsistencies within the labels of top AV engines, suggesting that their efforts are directed towards accurate detection rather than classification. Our results contribute to a better understanding of the limitations of using Android malware family labels as supplied by common AV engines.
- Research Article
49
- 10.3390/e23081009
- Aug 3, 2021
- Entropy
- Chao Ding + 3 more
With the popularity of Android, malware detection and family classification have also become a research focus. Many excellent methods have been proposed by previous authors, but static and dynamic analyses inevitably require complex processes. A hybrid analysis method for detecting Android malware and classifying malware families is presented in this paper, and is partially optimized for multiple-feature data. For static analysis, we use permissions and intent as static features and use three feature selection methods to form a subset of three candidate features. Compared with various models, including k-nearest neighbors and random forest, random forest is the best, with a detection rate of 95.04%, while the chi-square test is the best feature selection method. After using feature selection to explore the critical static features contained in this dataset, we analyzed a subset of important features to gain more insight into the malware. In a dynamic analysis based on network traffic, unlike those that focus on a one-way flow of traffic and work on HTTP protocols and transport layer protocols, we focused on sessions and retained protocol layers. The Res7LSTM model is then used to further classify the malicious and partially benign samples detected in the static detection. The experimental results show that our approach can not only work with fewer static features and guarantee sufficient accuracy, but also improve the detection rate of Android malware family classification from 71.48% in previous work to 99% when cutting the traffic in terms of the sessions and protocols of all layers.
- Research Article
113
- 10.1016/j.cose.2021.102399
- Jul 9, 2021
- Computers & Security
- Alejandro Guerra-Manzanares + 2 more
KronoDroid: Time-based Hybrid-featured Dataset for Effective Android Malware Detection and Characterization
- Research Article
- 10.5626/ktcp.2021.27.4.189
- Apr 30, 2021
- KIISE Transactions on Computing Practices
- Munyeong Kang + 4 more
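As Android malware continues to increase, Android malware detection and classification techniques based on machine learning have been widely studied. Malware family classification groups malware samples into related groups and plays an important role in computer forensic analysis, threat assessment, and threat mitigation planning. In this paper, we propose a method that converts part of an executable file into a grayscale image and then applies deep learning to the converted images to classify Android malware families. We apply a Convolutional Neural Network (CNN) model to the representative malware families provided in Drebin, a widely used Android malware dataset. Comparing our experimental results with those of existing studies, we show that our approach is more effective for malware classification in terms of data reduction, selection of an appropriate data size, and accuracy.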
- Research Article
14
- 10.1109/tse.2021.3067061
- Mar 19, 2021
- IEEE Transactions on Software Engineering
- Francesco Mercaldo + 1 more
Several techniques to overcome the weaknesses of the current signature based detection approaches adopted by free and commercial antimalware have been proposed by industrial and research communities. These techniques are mainly supervised machine learning based, requiring optimal class balance to generate good predictive models. In this paper, we propose a method to infer mobile application maliciousness by detecting the belonging family, exploiting formal equivalence checking. We introduce a set of heuristics to reduce the number of mobile application comparisons and we define a metric reflecting the application maliciousness. Real-world experiments on 35 Android malware families (ranging from 2010 to 2018) confirm the effectiveness of the proposed method in mobile malware detection and family identification.
- Research Article
- 10.5626/ktcp.2021.27.2.116
- Feb 28, 2021
- KIISE Transactions on Computing Practices
- Heejin Kim + 3 more
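As Android malicious apps increase rapidly, a variety of studies on malicious app classification are under way. Among them, malicious app family classification makes it possible to classify malicious apps quickly according to their feature information, even when new variants appear. In this paper, we therefore propose a technique that classifies malicious app families efficiently by applying label accuracy as a weight to the malicious app labels provided by VirusTotal, a service of the US company Google. The label accuracy used in the proposed technique is derived from the classification information of malicious app datasets used in various studies and from the classification information in malicious app label analysis reports. We then apply the proposed technique to the malicious app datasets provided by Drebin and AMD and measure family classification performance; the results show that the proposed technique outperforms existing classification techniques.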
- Research Article
40
- 10.1080/09540091.2021.1889977
- Feb 23, 2021
- Connection Science
- Gianni D'Angelo + 3 more
Due to their open nature and popularity, Android-based devices have attracted several end-users around the World and are one of the main targets for attackers. Because of the reasons given above, it is necessary to build tools that can reliably detect zero-day malware on these devices. At the moment, many of the frameworks that have been proposed to detect malware applications leverage Machine Learning (ML) techniques. However, an essential requirement to build these frameworks consists of using very large and sophisticated datasets for model construction and training purposes. Their success, indeed, strongly depends on the choice of the right features used for building a classification model providing adequate generalisation capability. Furthermore, the creation of a training dataset that well represents the malware properties and behaviour is one of the most critical challenges in malware analysis. Therefore, the main aim of this paper is proposing a new dataset called Unisa Malware Dataset (UMD) available on http://antlab.di.unisa.it/malware/, which is based on the extraction of static and dynamic features characterising the malware activities. Additionally, we will show some experiments concerning common ML tools to demonstrate how it is possible to build efficient ML-based malware classification frameworks using the proposed dataset.
- Research Article
1
- 10.1504/ijahuc.2021.10042892
- Jan 1, 2021
- International Journal of Ad Hoc and Ubiquitous Computing
- Aamir Rasool + 2 more
SHA-AMD: sample-efficient hyper-tuned approach for detection and identification of Android malware family and category
- Research Article
6
- 10.1504/ijahuc.2021.119097
- Jan 1, 2021
- International Journal of Ad Hoc and Ubiquitous Computing
- Aamir Rasool + 2 more
SHA-AMD: sample-efficient hyper-tuned approach for detection and identification of Android malware family and category
- Research Article
23
- 10.1016/j.comnet.2020.107639
- Oct 28, 2020
- Computer Networks
- Yude Bai + 4 more
Comparative analysis of feature representations and machine learning methods in Android family classification