The application of new techniques to increase the performance of intrusion detection systems is crucial in modern data networks with a growing threat of cyber-attacks. These attacks impose a greater risk on network services that are increasingly important from a social end economical point of view. In this work we present a novel application of several deep reinforcement learning (DRL) algorithms to intrusion detection using a labeled dataset. We present how to perform supervised learning based on a DRL framework.The implementation of a reward function aligned with the detection of intrusions is extremely difficult for Intrusion Detection Systems (IDS) since there is no automatic way to identify intrusions. Usually the identification is performed manually and stored in datasets of network features associated with intrusion events. These datasets are used to train supervised machine learning algorithms for classifying intrusion events. In this paper we apply DRL using two of these datasets: NSL-KDD and AWID datasets. As a novel approach, we have made a conceptual modification of the classic DRL paradigm (based on interaction with a live environment), replacing the environment with a sampling function of recorded training intrusions. This new pseudo-environment, in addition to sampling the training dataset, generates rewards based on detection errors found during training.We present the results of applying our technique to four of the most relevant DRL models: Deep Q-Network (DQN), Double Deep Q-Network (DDQN), Policy Gradient (PG) and Actor-Critic (AC). The best results are obtained for the DDQN algorithm.We show that DRL, with our model and some parameter adjustments, can improve the results of intrusion detection in comparison with current machine learning techniques. Besides, the classifier obtained with DRL is faster than alternative models. A comprehensive comparison of the results obtained with other machine learning models is provided for the AWID and NSL-KDD datasets, together with the lessons learned from the application of several design alternatives to the four DRL models.
Read full abstract