The rise in cyber security issues has caused significant harm to tech world and thus society in recent years. Intrusion detection systems (IDSs) are crucial for the detection and the mitigation of the increasing risk of cyber attacks. False and disregarded alarms are a common problem for traditional IDSs in high-bandwidth and large-scale network systems. While applying learning techniques to intrusion detection, researchers are facing challenges mainly due to the imbalanced training sets and the high dimensionality of datasets, resulting from the scarcity of attack data and longer training periods, respectively. Thus, this leads to reduced efficiency. In this research study, we propose a strategy for dealing with the problems of imbalanced datasets and high dimensionality in IDSs. In our efficient and novel framework, we integrate an oversampling strategy that uses Generative Adversarial Networks (GANs) to overcome the difficulties introduced by imbalanced datasets, and we use the Random Forest (RF) importance algorithm to select a subset of features that best represent the dataset to reduce the dimensionality of a training dataset. Then, we use three deep learning techniques, Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM), to classify the attacks. We implement and evaluate this proposed framework on the CICIDS2017 dataset. Experimental results show that our proposed framework outperforms state-of-the-art approaches, vastly improving DL model detection accuracy by 98% using CNN.
Read full abstract