Malware Generation with Specific Behaviors to Improve Machine Learning-based Detection

Michael R Smith, Nicholas T Johnson, Raga Krishnakumar, Kanad Khanna, Sophie Quynn, Stephen J Verzi, Xin Zhou

doi:10.1109/bigdata52589.2021.9671886

https://doi.org/10.1109/bigdata52589.2021.9671886

Abstract

We describe efforts to generate synthetic malware samples with specified behaviors, which can then be used to train a machine learning (ML) algorithm to detect behaviors in malware. The idea behind detecting behaviors is that a set of core behaviors is shared across many malware variants, so being able to detect those behaviors will improve the detection of novel malware. Empirically, however, the multi-label task of detecting behaviors is significantly more difficult than malware classification, achieving on average only 84% accuracy across all behaviors, as opposed to the greater than 95% multi-class or binary accuracy reported in many malware detection studies. One difficulty in identifying behaviors is that, while malware samples are ample, most data sources do not include behavioral labels, so there is generally insufficient training data for behavior identification. Inspired by the success of generative models in improving image processing techniques, we examine and extend 1) a conditional variational auto-encoder and 2) a flow-based generative model for malware generation with behavior labels. Initial experiments indicate that synthetic data can capture behavioral information and increase the recall of behaviors in novel malware from 32% to 45% without increasing false positives, and to 52% with increased false positives.
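
The abstract reports recall and false positives for a multi-label task, where each sample can carry several behavior labels at once. As a minimal sketch of how such per-behavior metrics might be computed, the example below uses toy data; the behavior names, the label matrices, and the helper functions are all hypothetical and are not taken from the paper.

```python
# Illustrative only: toy multi-label data, not the paper's dataset or method.
# Each row is one sample; each column is one (hypothetical) behavior label.
BEHAVIORS = ["keylogging", "persistence", "network_beacon"]

y_true = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 1]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 0, 1]]


def per_behavior_metrics(y_true, y_pred):
    """Return a list of (recall, false_positive_rate), one pair per behavior."""
    results = []
    for j in range(len(y_true[0])):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred)]
        tp = sum(1 for t, p in pairs if t == 1 and p == 1)
        fn = sum(1 for t, p in pairs if t == 1 and p == 0)
        fp = sum(1 for t, p in pairs if t == 0 and p == 1)
        tn = sum(1 for t, p in pairs if t == 0 and p == 0)
        # Guard against behaviors with no positive (or no negative) samples.
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        fpr = fp / (fp + tn) if (fp + tn) else 0.0
        results.append((recall, fpr))
    return results


def macro_average(values):
    """Unweighted mean across behaviors."""
    return sum(values) / len(values)


metrics = per_behavior_metrics(y_true, y_pred)
macro_recall = macro_average([r for r, _ in metrics])
macro_fpr = macro_average([f for _, f in metrics])

for name, (recall, fpr) in zip(BEHAVIORS, metrics):
    print(f"{name}: recall={recall:.2f}, false positive rate={fpr:.2f}")
print(f"macro recall={macro_recall:.2f}, macro FPR={macro_fpr:.2f}")
```

Tracking recall and false positive rate separately for each behavior makes it possible to see a trade-off of the kind the abstract describes, where recall can rise with or without an accompanying rise in false positives. Whether the paper averages over behaviors in this unweighted way is an assumption here.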
