DMalNet: Dynamic malware analysis based on API feature engineering and graph learning

Yan Wang, Ce Li, Zijun Cheng, Leiqi Wang, Degang Sun, Ning Li, Qiujian Lv, He Zhu

doi:10.1016/j.cose.2022.102872

Abstract

Application Programming Interfaces (APIs) are widely considered a useful data source for dynamic malware analysis because they reveal the behavioral characteristics of malware. However, the accuracy of API-based malware analysis is limited for two reasons. (1) Existing solutions often consider only the API names while ignoring the API arguments, or cannot fully exploit the semantic information in different types of arguments. (2) The relationships between API calls are important for describing software behavior but are difficult to capture. To overcome these limitations, we propose DMalNet, a novel malware analysis framework for accurate malware detection and classification. Specifically, we first present a hybrid feature encoder to extract semantic features from API names and arguments. Then, we derive an API call graph from the API call sequence, converting the relationships between API calls into the structural information of the graph. Finally, we design a graph neural network to perform malware detection and type classification. To evaluate our approach, we use datasets of over 20k benign and 18k malicious samples belonging to 8 malware types. DMalNet achieves 98.43% and 91.42% accuracy on malware detection and malware type classification, respectively. We also conduct ablation studies to assess the impact of API feature engineering and the graph learning model. Further experiments show that DMalNet can effectively detect malware.
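
As an illustration of the graph-derivation step, the sketch below turns an API call sequence into a directed graph. It is a minimal example under stated assumptions, not the construction used by DMalNet: it assumes that nodes are distinct API names, that an edge links each call to the call that immediately follows it, and that edge weights count how often that transition occurs. The function name `build_api_call_graph` and the sample trace are hypothetical.

```python
from collections import Counter


def build_api_call_graph(calls):
    """Build a directed graph from an ordered list of API names.

    Assumption (not taken from the paper): one node per distinct API name,
    one edge per pair of consecutive calls, weighted by transition count.
    """
    nodes = []   # node id -> API name, in order of first appearance
    index = {}   # API name -> node id
    for name in calls:
        if name not in index:
            index[name] = len(nodes)
            nodes.append(name)

    edges = Counter()
    for src, dst in zip(calls, calls[1:]):
        edges[(index[src], index[dst])] += 1
    return nodes, dict(edges)


# Hypothetical trace, for illustration only.
trace = ["NtOpenFile", "NtReadFile", "NtWriteFile", "NtReadFile", "NtClose"]
nodes, edges = build_api_call_graph(trace)
for (src, dst), weight in edges.items():
    print(f"{nodes[src]} -> {nodes[dst]} (x{weight})")
```

In a pipeline of the kind the abstract describes, each node would additionally carry the feature vector produced by the encoder for the API name and its arguments, and the resulting graph would be passed to a graph neural network.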

Full Text