Abstract

Over the years, artificial neural networks have been applied successfully in many areas, including IT security. Yet neural networks can only process continuous input data, which is particularly challenging for security-related, non-continuous data such as the system calls of an operating system. This work focuses on five different options for preprocessing sequences of system calls so that they can be processed by neural networks. These input options are based on one-hot encodings and on learned word2vec, GloVe, or fastText representations of system calls. As an additional option, we analyse whether mapping system calls to their respective kernel modules is an adequate generalization step for (i) replacing system calls or (ii) enriching system call data with additional information about their context. When performing such preprocessing steps, it is important to ensure that no relevant information is lost. The overall objective of system call analysis in IT security is to classify a sequence of system calls as benign or malicious behavior. Therefore, this scenario is used to evaluate the different system call representations in a classification task. The results indicate that a broader range of attacks can be detected when system call representations are enriched with the corresponding kernel module information. Prior learning of embeddings does not yield significant improvements. This work is an extension of the work by Wunderlich et al. [1], published in Advances in Intelligent Systems and Computing (AISC, volume 951).
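To make the two simplest input options concrete, the following is a minimal sketch of one-hot encoding a system call sequence and optionally enriching each call with a one-hot vector for its kernel module. The vocabulary `SYSCALLS`, the mapping `KERNEL_MODULE`, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper, whose actual vocabulary, mapping, and network input format may differ.

```python
# Hypothetical system call vocabulary and kernel module mapping.
# These are illustrative assumptions, not the mapping used in the paper.
SYSCALLS = ["open", "read", "write", "mmap", "socket"]
KERNEL_MODULE = {
    "open": "fs",
    "read": "fs",
    "write": "fs",
    "mmap": "mm",
    "socket": "net",
}
MODULES = sorted(set(KERNEL_MODULE.values()))  # ['fs', 'mm', 'net']


def one_hot(item, vocabulary):
    """Return a vector with a single 1.0 at the index of `item`."""
    vec = [0.0] * len(vocabulary)
    vec[vocabulary.index(item)] = 1.0
    return vec


def encode_sequence(calls, enrich=False):
    """One-hot encode each call; if `enrich` is set, append the
    one-hot vector of the call's (assumed) kernel module."""
    encoded = []
    for call in calls:
        vec = one_hot(call, SYSCALLS)
        if enrich:
            vec = vec + one_hot(KERNEL_MODULE[call], MODULES)
        encoded.append(vec)
    return encoded


if __name__ == "__main__":
    trace = ["open", "read", "mmap", "socket"]
    for row in encode_sequence(trace, enrich=True):
        print(row)
```

With `enrich=False` each call becomes a vector of length `len(SYSCALLS)`; with `enrich=True` the module vector is concatenated, so calls that share a module also share part of their representation. Replacing system calls by their modules, option (i) in the abstract, would correspond to keeping only the module part.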
