Abstract
Deep learning-based Intrusion Detection Systems (IDSs) have received significant attention from the research community for their capability to handle modern-day security systems in large-scale networks. Despite their considerable improvement in performance over machine learning-based techniques and conventional statistical models, deep neural networks (DNNs) suffer from catastrophic forgetting: the model forgets previously learned information when trained on newer data points. This vulnerability is especially exacerbated in large-scale systems due to frequent changes in network architecture and behaviour, which lead to changes in the data distribution and the introduction of zero-day attacks; this phenomenon is termed covariate shift. Due to these constant changes in the data distribution, DNN models cannot consistently maintain high accuracy and a low false positive rate (FPR) without regular updates. However, before updating the DNN models, it is essential to understand the magnitude and nature of the drift in the data distribution. In this paper, to analyze the drift in data distribution, we propose an eight-stage, statistics- and machine-learning-guided implementation framework that objectively studies and quantifies the changes. Further, to handle the changes in data distribution, most IDS solutions collect network packets and store them to retrain the DNN models periodically, but as the network's size and complexity increase, those tasks become expensive. To solve this problem efficiently, we explore the potential of continual learning models to incrementally learn new data patterns while retaining their previous knowledge. We perform an experimental and analytical study of advanced intrusion detection systems using three major continual learning approaches (learning without forgetting, experience replay, and dark experience replay) on the NSL-KDD and CICIDS 2017 datasets.
Through extensive experimentation, we show that our continual learning models achieve improved accuracy and lower FPR than state-of-the-art works while also being able to incrementally learn newer data patterns. Finally, we highlight the drawbacks of traditional statistical and non-gradient-based machine learning approaches in handling the covariate shift problem.