Getting to the root of the problem: A detailed comparison of kernel and user level data for dynamic malware analysis

Pete Burnap, Philipp Reinecke, Kaelon Lloyd, Matthew Nunes, Omer Rana

doi:10.1016/j.jisa.2019.102365

Open Access

Abstract

Dynamic malware analysis is fast gaining popularity over static analysis since it is not easily defeated by evasion tactics such as obfuscation and polymorphism. During dynamic analysis it is common practice to capture the system calls that are made to better understand the behaviour of malware. There are several techniques to capture system calls, the most popular of which is a user-level hook. To study the effects of collecting system calls at different privilege levels and viewpoints, we collected process-specific data at the user level using a virtualised sandbox environment and system-wide data at the kernel level using a custom-built kernel driver. We then tested the performance of several state-of-the-art machine learning classifiers on the data. Random Forest was the best performing classifier, with an accuracy of 95.2% on the kernel driver data and 94.0% on the user-level data. Combining user-level and kernel-level data gave the best classification results, with an accuracy of 96.0% for Random Forest. This may seem intuitive but had not previously been demonstrated empirically. Additionally, we observed that machine learning algorithms trained on user-level data tended to use the anti-debug/anti-VM features in malware to distinguish it from benignware, whereas algorithms trained on data from our kernel driver seemed to use differences in the general behaviour of the system to make their predictions, which explains why the two complement each other so well. Our results show that capturing data at different privilege levels affects the classifier’s ability to detect malware, with kernel-level data providing more utility than user-level data for malware classification. Despite this, there exist more established user-level tools than kernel-level tools, suggesting that more research effort should be directed at the kernel level.
In short, this paper provides the first objective, evidence-based comparison of user and kernel level data for the purposes of malware classification.

Highlights

Malware, short for Malicious Software, is the all-encompassing term for unwanted software such as Viruses, Worms, and Trojans.
The results show that the data from the kernel driver is marginally better for differentiating between clean and malicious states, regardless of the machine learning algorithm used.
Motivated by the hypothesis that kernel-level API calls and user-level API calls do not produce the same classification results, we conducted experiments to understand the differences by collecting data at different privilege levels within the same Operating System.

Summary

Introduction

Malware, short for Malicious Software, is the all-encompassing term for unwanted software such as Viruses, Worms, and Trojans. Malware can be analysed in one of two ways: through static code analysis or dynamic behavioural analysis. Static code analysis involves studying the binary file and looking for patterns in its structure that might be indicative of malicious behaviour, without ever running the binary. Dynamic behavioural analysis involves running the binary in a controlled environment, such as an emulated environment or Virtual Machine (VM), and searching for patterns of Operating System (OS) calls or general system behaviour that are indicative of malicious behaviour. Static analysis has become less effective in recent years because malware writers can circumvent detection methods using techniques such as code obfuscation and polymorphism [2,3]. Behavioural analysis has gained popularity since it runs malware in its preferred environment, making it harder to evade detection completely.
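
As an illustration of what searching for patterns of OS calls can look like in practice, the following minimal Python sketch turns a captured call trace into a vector of per-call counts and joins a user-level view and a kernel-level view into a single feature vector. The count-based representation, the function names, and the example traces are assumptions made for illustration only; they are not taken from the paper's method.

```python
from collections import Counter


def call_counts(trace, vocabulary):
    """Return how often each call name in `vocabulary` occurs in `trace`.

    `trace` is a list of call names in the order they were observed;
    `vocabulary` fixes the order of the resulting feature vector.
    """
    counts = Counter(trace)
    return [counts[name] for name in vocabulary]


def combined_features(user_trace, kernel_trace, user_vocab, kernel_vocab):
    """Concatenate the user-level and kernel-level count vectors."""
    return call_counts(user_trace, user_vocab) + call_counts(kernel_trace, kernel_vocab)


# Hypothetical traces for one sample; the call names are illustrative only.
user_trace = ["NtCreateFile", "NtWriteFile", "NtWriteFile", "IsDebuggerPresent"]
kernel_trace = ["NtCreateFile", "NtWriteFile", "NtWriteFile",
                "NtQuerySystemInformation", "NtClose"]

# Hypothetical vocabularies; in practice these would be built from all samples.
user_vocab = ["IsDebuggerPresent", "NtCreateFile", "NtWriteFile"]
kernel_vocab = ["NtClose", "NtCreateFile", "NtQuerySystemInformation", "NtWriteFile"]

features = combined_features(user_trace, kernel_trace, user_vocab, kernel_vocab)
print(features)
```

A vector of this kind, one per sample, is the sort of input that could be passed to a classifier such as Random Forest. Keeping the two vocabularies separate preserves which viewpoint each count came from.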

Methods

Results

Conclusion