Abstract

Malware is a constant and continuously evolving threat, and security systems struggle to keep pace with it. One challenge is the large volume of logs generated by an operating system and the need to determine which information actually contributes to detecting possible malware. This work aims to detect malware with neural networks based on Windows audit log events. Neural networks require continuous numerical input, whereas Windows audit logs are sequential, textual data. To address these challenges, we extract features from the audit log events and use LSTMs to capture sequential effects. We create different feature subsets and analyze the effect of adding information. Features describe, for example, the action type of a Windows audit log event, process names, or the target files that are accessed. Textual features are represented either as one-hot encodings or as embeddings; for the latter, we compare three different representation learning approaches. The effects of the different feature subsets and representations are evaluated on a publicly available data set. Results indicate that using additional information improves the performance of the LSTM model. While the different representations lead to similar classification results, an analysis of the latent space reveals the differences more clearly, with FastText appearing to be the most promising representation.
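As an illustration of the one-hot representation mentioned above, the following is a minimal sketch of how a sequence of audit log events might be turned into numerical vectors before being passed to a sequence model. It is not the authors' implementation: the event tuples, the action-type and process names, and the helper functions `build_vocab`, `one_hot`, and `encode_events` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def build_vocab(tokens: Sequence[str]) -> Dict[str, int]:
    """Assign each distinct token an index in order of first appearance."""
    vocab: Dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def one_hot(token: str, vocab: Dict[str, int]) -> List[float]:
    """Return a vector with a single 1.0 at the token's vocabulary index."""
    vec = [0.0] * len(vocab)
    vec[vocab[token]] = 1.0
    return vec


def encode_events(
    events: Sequence[Tuple[str, str]],
    action_vocab: Dict[str, int],
    process_vocab: Dict[str, int],
) -> List[List[float]]:
    """Concatenate the one-hot vectors of both features for every event."""
    return [
        one_hot(action, action_vocab) + one_hot(process, process_vocab)
        for action, process in events
    ]


# Hypothetical (action type, process name) pairs; not taken from the data set.
events = [
    ("FILE_CREATE", "cmd.exe"),
    ("FILE_WRITE", "cmd.exe"),
    ("PROCESS_START", "powershell.exe"),
]

action_vocab = build_vocab([action for action, _ in events])
process_vocab = build_vocab([process for _, process in events])

# One vector per event, in log order, so the sequential structure is preserved.
sequence = encode_events(events, action_vocab, process_vocab)
```

Each event becomes one time step whose vector length equals the sum of the vocabulary sizes. This growth with vocabulary size is one reason a learned embedding can be an attractive alternative for features with many distinct values, such as file paths.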
