Abstract

This article describes an experimental investigation into the inter-dataset generalization of supervised machine learning methods trained to distinguish between benign network flows and several classes of malicious ones. The first part details the process and results of establishing reference classification scores on CIC-IDS2017 and CSE-CIC-IDS2018, two modern, labeled data sets for testing intrusion detection systems. Each data set is divided into several days, each pertaining to different attack classes (DoS, DDoS, infiltration, botnet, etc.). A pipeline was created that includes twelve supervised learning algorithms from different families. Following this comparative analysis, the DoS/DDoS and botnet attack classes, which are represented in both data sets and are classified well by many of the algorithms, were selected to test the inter-dataset generalization strength of the trained models. Exposing these models to unseen but related samples without additional training was expected to preserve high classification performance, but this assumption is shown to be erroneous (at least for the tested attack classes). To our knowledge, no prior literature validates the efficacy of supervised ML-based intrusion detection systems outside the data set(s) on which they were trained. Our first results call into question the implicit assumption that strong intra-dataset generalization leads to strong inter- or extra-dataset generalization. Further experimentation is required to determine the scope and causes of this deficiency, as well as potential solutions.
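
To illustrate the evaluation protocol described above, the following Python sketch trains two supervised classifiers on labeled flows from one data set, records an intra-dataset reference score on a held-out split, and then applies the same fitted models, without any additional training, to flows of the same attack class from the other data set. The file names, label column, and choice of scikit-learn classifiers are illustrative assumptions and do not reflect the authors' actual twelve-algorithm pipeline.

    # Minimal sketch of intra- vs inter-dataset evaluation (assumed file names and columns).
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import f1_score

    def load_flows(path):
        """Load labeled flow records; map labels to binary benign (0) / malicious (1)."""
        df = pd.read_csv(path)
        X = df.drop(columns=["Label"])
        y = (df["Label"] != "BENIGN").astype(int)
        return X, y

    # Hypothetical pre-processed flow exports for the same attack class in both data sets;
    # assumes both files expose an identical, aligned feature set.
    X_2017, y_2017 = load_flows("cicids2017_ddos_flows.csv")
    X_2018, y_2018 = load_flows("cicids2018_ddos_flows.csv")

    # Intra-dataset reference score: train and test within CIC-IDS2017.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_2017, y_2017, test_size=0.3, stratify=y_2017, random_state=42)

    for clf in (RandomForestClassifier(n_estimators=100, random_state=42),
                LogisticRegression(max_iter=1000)):
        clf.fit(X_tr, y_tr)
        intra = f1_score(y_te, clf.predict(X_te))
        # Inter-dataset score: the same trained model applied to CSE-CIC-IDS2018
        # samples without any additional training.
        inter = f1_score(y_2018, clf.predict(X_2018))
        print(f"{type(clf).__name__}: intra F1={intra:.3f}, inter F1={inter:.3f}")

In this setup, a large gap between the intra-dataset and inter-dataset F1 scores would reflect the generalization deficiency reported in the article.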
