Adversarial environment reinforcement learning algorithm for intrusion detection

Guillermo Caminero,Manuel Lopez-Martin,Belen Carro

doi:10.1016/j.comnet.2019.05.013

Abstract

Intrusion detection is a crucial service in today’s data networks, and the search for new fast and robust algorithms that are capable of detecting and classifying dangerous traffic is essential to deal with changing threats and increasing detection difficulty. In this work, we present a new intrusion detection algorithm with an excellent prediction performance. The prediction is based on a classifier which is a simple and extremely fast neural network. The classifier implements a policy function that is trained with a novel reinforcement learning model, where the behavior of the environment is adjusted in parallel with the learning process.Intrusion detection frameworks are based on a supervised learning paradigm that uses a training dataset composed of network features and associated intrusion labels. In this work, we integrate this paradigm with a reinforcement learning algorithm that is normally based on interaction with a live environment (not a pre-recorded dataset). To perform the integration, the live environment is replaced by a simulated one.The principle of this approach is to provide the simulated environment with an intelligent behavior by, first, generating new samples by randomly extracting them from the training dataset, generating rewards that depend on the goodness of the classifier's predictions, and, second, by further adjusting this initial behavior with an adversarial objective in which the environment will actively try to increase the difficulty of the prediction made by the classifier. In this way, the simulated environment acts as a second agent in an adversarial configuration against the original agent (the classifier). We prove that this architecture increases the final performance of the classifier.This work presents the first application of adversarial reinforcement learning for intrusion detection, and provides a novel technique that incorporates the environment's behavior into the learning process of a modified reinforcement learning algorithm.We prove that the proposed algorithm is adequate for a supervised learning problem based on a labeled dataset. We validate its performance by comparing it with other well-known machine learning models for two datasets. The proposed model outperforms the other models in the weighted Accuracy (>0.8) and F1 (>0.79) metrics, and especially excels in the results for the under-represented labels.

Full Text