MsDroid: Identifying Malicious Snippets for Android Malware Detection

Yiling He, Kui Ren, Ziqi Yang, Zhan Qin, Yiping Liu, Lei Wu

doi:10.1109/tdsc.2022.3168285

Abstract

Machine learning has shown promise for improving the accuracy of Android malware detection in the literature. However, it is challenging to (1) remain robust in real-world scenarios and (2) provide interpretable explanations for experts to analyse. In this paper, we propose MsDroid, an Android malware detection system that makes decisions by identifying malicious snippets with interpretable explanations. We mimic a common practice of security analysts, i.e., filtering APIs before looking through each method, to focus on local snippets around sensitive APIs instead of the whole program. Each snippet is represented as a graph encoding both code attributes and domain knowledge and is then classified by a graph neural network (GNN). To identify malicious snippets, we present a semi-supervised learning approach that requires only app-level labels. To make malicious snippets less opaque, we design an explanation mechanism that shows the importance of control flows and retrieves similarly implemented snippets from known malware. A comprehensive comparison with 5 baseline methods is conducted on a dataset of more than 81K apps in 3 real-world scenarios. The experimental results show that MsDroid is more robust than state-of-the-art systems in all cases, with an advantage of 5.37% to 49.52% in F1-score. In addition, we demonstrate how the explanations facilitate malware analysis.
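The idea of focusing on local snippets around sensitive APIs can be illustrated with a minimal sketch. It is not the authors' implementation: the call graph, the two-entry sensitive API set, the hop limit `k`, and the function name `extract_snippets` are all assumptions chosen for illustration, and the graph encoding and GNN classification stages are omitted.

```python
from collections import deque

# Hypothetical sensitive API set; a real system would rely on a curated list.
SENSITIVE_APIS = {"sendTextMessage", "getDeviceId"}


def extract_snippets(call_graph, sensitive_apis, k=2):
    """Return {api: nodes within k hops of api} for each sensitive API present.

    call_graph maps a caller name to a list of callee names. Edges are
    treated as undirected so that both callers and callees are collected
    (an assumption of this sketch).
    """
    # Build an undirected adjacency map from the caller -> callees mapping.
    neighbors = {}
    for caller, callees in call_graph.items():
        neighbors.setdefault(caller, set())
        for callee in callees:
            neighbors[caller].add(callee)
            neighbors.setdefault(callee, set()).add(caller)

    snippets = {}
    for api in sensitive_apis:
        if api not in neighbors:
            continue  # this API is never called in the app
        # Breadth-first search bounded at depth k.
        seen = {api}
        queue = deque([(api, 0)])
        while queue:
            node, depth = queue.popleft()
            if depth == k:
                continue  # do not expand beyond the hop limit
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
        snippets[api] = seen
    return snippets


# Toy call graph with made-up method names.
call_graph = {
    "onCreate": ["initUi", "startService"],
    "startService": ["collectInfo"],
    "collectInfo": ["getDeviceId", "sendTextMessage"],
    "initUi": ["setContentView"],
}
snippets = extract_snippets(call_graph, SENSITIVE_APIS, k=1)
```

With `k=1`, each snippet contains only the sensitive API and its direct caller, so UI code such as `initUi` is left out of the analysis. Each extracted node set would then be turned into an attributed graph and passed to a classifier.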

Full Text