Combining Unsupervised Approaches for Near Real-Time Network Traffic Anomaly Detection

Stefano Galantucci, Vincenzo Dentamaro, Francesco Carrera, Giuseppe Pirlo, Andrea Iannacone, Donato Impedovo

doi:10.3390/app12031759

Open Access

https://doi.org/10.3390/app12031759

Abstract

A 0-day attack is a cyber-attack that exploits vulnerabilities that have not yet been published. Detecting the anomalous traffic generated by such attacks is vital, as they can pose a critical problem, in both a technical and an economic sense, for a smart enterprise as well as for any system that depends heavily on technology. One way to predict this kind of attack is to use unsupervised machine learning approaches, since they can detect anomalies without prior knowledge of them. It is also essential to identify anomalous and unknown behaviors within a network in near real-time. Three different approaches are proposed and benchmarked under exactly the same conditions: Deep Autoencoding with GMM and Isolation Forest, Deep Autoencoder with Isolation Forest, and Memory Augmented Deep Autoencoder with Isolation Forest. Each approach is thus the result of combining different unsupervised algorithms. The results show that adding the Isolation Forest improves accuracy and increases inference time, although the increase is not large enough to be a significant problem. This paper also explains which features the various models consider most important for classifying an event as an attack, using the explainable artificial intelligence methodology called Shapley Additive Explanations (SHAP). Experiments were conducted on the KDD99, NSL-KDD, and CIC-IDS2017 datasets.
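To make the Isolation Forest component named in the abstract more concrete, the following is a minimal, standard-library-only sketch of isolation-based anomaly scoring. It is not the authors' implementation: the autoencoder stages are omitted, and the two-dimensional toy vectors merely stand in for whatever features would be passed to the forest. All function names, parameters (`n_trees`, `sample_size`), and the synthetic data are illustrative assumptions.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def average_path_length(n):
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split further along this feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Depth at which the point lands, adjusted for the unsplit leaf size."""
    if "split" not in tree:
        return depth + average_path_length(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the data."""
    if len(points) < 2:
        raise ValueError("need at least two points to build a forest")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, point):
    """Score in (0, 1]; values closer to 1 indicate easier isolation (more anomalous)."""
    trees, sample_size = forest
    mean_path = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / average_path_length(sample_size))


# Toy data (assumption): a dense "normal" cluster plus a few distant events.
data_rng = random.Random(42)
normal = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
anomalies = [[data_rng.gauss(8, 0.5), data_rng.gauss(8, 0.5)] for _ in range(5)]

forest = fit_forest(normal + anomalies)
score_normal = anomaly_score(forest, [0.0, 0.0])
score_anomalous = anomaly_score(forest, [8.0, 8.0])
```

Points that are separated from the rest of the data after only a few random splits receive a higher score, which is why no labels or prior knowledge of the attack are needed. How the autoencoder outputs are wired into the Isolation Forest is not described in this section, so that step is deliberately left out of the sketch.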

Highlights

Cyber attacks can impact the performance of networks, allow access to and the modification of confidential data, and compromise the security of virtually any infrastructure belonging to the network itself
The present work analyzes the state of the art of Anomaly Detection systems and how innovative Machine Learning algorithms can be used for this purpose
The study of Anomaly Detection systems was guided by the application under examination, namely identifying anomalous behaviors within a network in near real-time

Summary

Introduction

Cyber attacks can impact the performance of networks (corporate or otherwise), allow access to and the modification of confidential data, and compromise the security of virtually any infrastructure belonging to the network itself. This issue can have a significant economic impact on a smart enterprise. It is therefore of fundamental importance to create algorithms that can predict 0-day attacks automatically and quickly (in near real-time). A near real-time anomaly detection system must take into account both the processing speed of the algorithm and the variations in attack patterns that arise from the different kinds of network traffic monitored in a real setting. The system must be able to respond dynamically to changes in the monitored network traffic.

Objectives

Methods

Results

Discussion

Conclusion