Learning from Context: A Multi-View Deep Learning Architecture for Malware Detection

Adarsh Kyadige, Konstantin Berlin, Ethan M Rudd

doi:10.1109/spw50608.2020.00018

Abstract

Machine learning (ML) classifiers used for malware detection typically employ numerical representations of the content of each file when making malicious/benign determinations. However, relevant information can also be gleaned from the context in which the file was seen, and this information is often ignored. One source of contextual information is the file's location on disk. For example, a malicious file masquerading as a known benign file (e.g., a Windows system DLL) is more likely to appear suspicious if the detector can make sensible use of information about the path at which it resides. Knowledge of the file path could also make it easier to detect files that try to evade disk scans by placing themselves in specific locations. File paths are also available with little overhead and can be integrated seamlessly into a multi-view static ML detector, potentially yielding higher detection rates at very high throughput and with minimal infrastructural changes. In this work, we propose a multi-view deep neural network architecture that takes feature vectors from the Portable Executable (PE) file content, together with the corresponding file paths, as inputs and outputs a detection score. We perform an evaluation on a commercial-scale dataset of approximately 10 million samples, consisting of files and file paths from user endpoints serviced by an actual security vendor. We then conduct an interpretability analysis via LIME modeling to ensure that our classifier has learned a sensible representation and to examine how the file path contributes to changes in the classifier's score in different cases. We find that our model learns useful aspects of the file path for classification, resulting in a 26.6% improvement in the true positive rate (TPR) at a 0.001 false positive rate (FPR) and a 64.6% improvement at 0.0001 FPR, compared to a model that operates on PE file content only.
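The headline results above are reported as TPR at a fixed FPR. As a minimal sketch of how such a figure can be computed from detection scores, the Python below sweeps the decision threshold from strictest to loosest and keeps the last TPR whose FPR stays within the target. The function names `tpr_at_fpr` and `relative_improvement` and the toy scores are hypothetical and are not taken from the paper. The abstract does not say whether its percentages are relative or absolute, so the relative form shown here is an assumption.

```python
def tpr_at_fpr(scores, labels, target_fpr):
    """Return the highest TPR reachable while keeping FPR <= target_fpr.

    scores: detection scores, where a higher score means "more likely malicious".
    labels: 1 for malicious, 0 for benign.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malicious and one benign sample")

    # Sort by descending score; lowering the threshold admits samples in this order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_tpr = 0.0
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg <= target_fpr:
            best_tpr = tp / n_pos
        else:
            break  # FPR only grows as the threshold drops further
    return best_tpr


def relative_improvement(baseline_tpr, new_tpr):
    """Relative change in TPR (an assumed reading of 'improvement')."""
    return (new_tpr - baseline_tpr) / baseline_tpr


# Invented toy data, for illustration only (not results from the paper).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
print(tpr_at_fpr(toy_scores, toy_labels, target_fpr=0.0))
```

At the very low FPR targets quoted in the abstract (0.001 and 0.0001), a stable estimate needs a large number of benign samples, since each allowed false positive corresponds to thousands of benign files.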


