SeGDroid: An Android malware detection method based on sensitive function call graph learning

Zhen Liu, Ruoyu Wang, Nathalie Japkowicz, Heitor Murilo Gomes, Bitao Peng, Wenbin Zhang

doi:10.1016/j.eswa.2023.121125

Abstract

Malware remains a challenging security problem in the Android ecosystem, as it is often obfuscated to evade detection. In such cases, semantic behavior feature extraction is crucial for training a robust malware detection model. In this paper, we propose a novel Android malware detection method (named SeGDroid) that focuses on learning semantic knowledge from sensitive function call graphs (FCGs). Specifically, we devise a graph pruning method that builds a sensitive FCG on the basis of an original FCG. The method preserves the call context of sensitive (security-related) APIs and removes irrelevant nodes from the FCG. We propose a node representation method based on word2vec and social-network-based centrality to extract attributes for graph nodes. This representation aims to capture both the semantic knowledge of the function calls and the structure of the graphs. Using this representation, we induce graph embeddings of the sensitive FCGs and their associated node attributes using a graph convolutional neural network algorithm. To provide a model explanation, we further propose a method that calculates node importance, which creates a mechanism for understanding malicious behavior. The experimental results show that SeGDroid achieves an F-score of 98% for malware detection on the CICMal2020 dataset and an F-score of 96% for malware family classification on the MalRadar dataset. In addition, the provided model explanation is able to trace the malicious behavior of Android malware.
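To make the pruning and node-attribute ideas concrete, the following is a minimal sketch and not the SeGDroid implementation. The abstract does not state the pruning rule, so the sketch assumes that "call context" means every function within a fixed number of call edges of a sensitive API. It also uses plain degree centrality as a stand-in for the social-network-based centrality measures. The function names, the `hops` parameter, and the example method names are hypothetical, and the word2vec and graph convolution steps are omitted.

```python
from collections import deque


def prune_to_sensitive(edges, sensitive, hops=1):
    """Keep edges whose endpoints lie within `hops` call edges of a sensitive API.

    Assumption: the k-hop neighborhood rule is illustrative only; the abstract
    does not specify how the call context is chosen.
    """
    # Treat calls as undirected here so both callers and callees are kept.
    neighbors = {}
    for caller, callee in edges:
        neighbors.setdefault(caller, set()).add(callee)
        neighbors.setdefault(callee, set()).add(caller)

    keep = set()
    # Multi-source breadth-first search starting from every sensitive node.
    queue = deque((node, 0) for node in sensitive if node in neighbors)
    while queue:
        node, dist = queue.popleft()
        if node in keep:
            continue
        keep.add(node)
        if dist < hops:
            for nxt in neighbors[node]:
                if nxt not in keep:
                    queue.append((nxt, dist + 1))
    return [(u, v) for u, v in edges if u in keep and v in keep]


def degree_centrality(edges):
    """Degree of each node divided by (number of nodes - 1)."""
    nodes = {n for edge in edges for n in edge}
    if len(nodes) < 2:
        return {n: 0.0 for n in nodes}
    degree = {n: 0 for n in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {n: d / (len(nodes) - 1) for n, d in degree.items()}


# Hypothetical call graph: one path reaches an SMS API, one only logs.
edges = [
    ("MainActivity.onCreate", "Util.log"),
    ("MainActivity.onCreate", "Sms.send"),
    ("Sms.send", "android.telephony.SmsManager.sendTextMessage"),
    ("Util.log", "android.util.Log.d"),
]
sensitive = {"android.telephony.SmsManager.sendTextMessage"}

pruned = prune_to_sensitive(edges, sensitive, hops=2)
centrality = degree_centrality(pruned)
```

In this toy graph the logging branch lies more than two call edges from the sensitive API, so it is dropped, while the path from `MainActivity.onCreate` to the SMS call is preserved. The centrality values could then be concatenated with a learned function-name embedding to form node attributes.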

Full Text