Cyber creative GAN for novel malicious packets

John Pavlik, Nathaniel D. Bastian, Raghuveer M. Rao, Kimberly E. Manser, Christopher L. Howell

doi:10.1117/12.2663700

Abstract

Machine learning (ML) requires both quantity and variety of examples to learn generalizable patterns. In cybersecurity, labeling network packets is tedious and difficult, so labeled packet datasets are too scarce to train ML-based Network Intrusion Detection Systems (NIDS) to detect malicious intrusions. Furthermore, benign network traffic and malicious cyber attacks are constantly evolving, so existing datasets quickly become obsolete. We investigate generative ML modeling for synthetic network packet generation and augmentation, with the aim of improving NIDS detection of novel, but similar, cyber attacks by generating well-labeled synthetic network traffic. We develop a Cyber Creative Generative Adversarial Network (CCGAN), inspired by earlier generative models that create new art styles from existing art images, and train it on existing NIDS datasets to generate new synthetic network packets. The goal is to create network packet payloads that appear malicious but come from distributions different from those of the original cyber attack classes. We use these synthetic malicious payloads to augment the training of an ML-based NIDS and evaluate whether it becomes better at correctly identifying whole classes of real malicious packet payloads that were held out during classifier training. Results show that data augmentation from CCGAN can increase a NIDS baseline accuracy on a novel malicious class from 79% to 97%, with minimal degradation in accuracy on benign classes (98.9% to 98.7%).
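The held-out-class evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (held_out_split, augment_with_synthetic, novel_class_accuracy), the toy byte arrays, and the random stand-in for generator output are all assumptions made for illustration. It shows only the data handling: remove one attack class from training, append synthetic payloads labeled malicious, and score a classifier on the held-out class.

```python
import numpy as np

def held_out_split(X, y, held_out_class):
    """Remove every sample of one class from the training data."""
    mask = (y == held_out_class)
    return X[~mask], y[~mask], X[mask], y[mask]

def augment_with_synthetic(X_train, y_binary, X_synth):
    """Append synthetic payloads, each labeled malicious (1)."""
    X_aug = np.concatenate([X_train, X_synth], axis=0)
    y_aug = np.concatenate([y_binary, np.ones(len(X_synth), dtype=int)])
    return X_aug, y_aug

def novel_class_accuracy(predict_fn, X_held):
    """Fraction of held-out malicious payloads flagged as malicious."""
    return float(np.mean(predict_fn(X_held) == 1))

# Toy data (hypothetical): 12 payloads of 8 bytes each.
# Label 0 is benign; labels 1 and 2 are two attack classes.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(12, 8))
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

# Hold out attack class 2 so the classifier never sees it in training.
X_tr, y_tr, X_held, y_held = held_out_split(X, y, held_out_class=2)
y_tr_binary = (y_tr != 0).astype(int)  # benign vs. malicious

# Random bytes stand in for payloads produced by a trained generator.
X_synth = rng.integers(0, 256, size=(5, 8))
X_aug, y_aug = augment_with_synthetic(X_tr, y_tr_binary, X_synth)
```

In a real experiment, a classifier would be trained once on (X_tr, y_tr_binary) and once on (X_aug, y_aug), and novel_class_accuracy would be compared between the two models on X_held, alongside accuracy on benign traffic.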

Full Text