Input splits design techniques for network intrusion detection on Hadoop cluster

Vladimir Ciric, Ivan Milentijevic, Nadja Gavrilovic, Natalija Stojanovic, Dusan Cvetkovic

doi:10.2298/fuee2102239c

Open Access

https://doi.org/10.2298/fuee2102239c

Abstract

An intrusion detection system (IDS) is one of the most important components used to monitor a network for possible cyber-attacks. However, the amount of data that must be inspected poses a great challenge to IDSs. With the recent emergence of various big data technologies, there are ways to overcome the problem of the increased amount of data. Nevertheless, some of these technologies inherit data distribution techniques that can be a problem when splitting sensitive data, such as network data frames, across cluster nodes. The goal of this paper is the design and implementation of a Hadoop-based IDS. In this paper we propose different input split techniques suitable for network data distribution across cloud nodes and test the performance of their Apache Hadoop implementations. Four different data split techniques are proposed, described in detail, and analysed. The system is evaluated on an Apache Hadoop cluster with 17 slave nodes. We show that processing speed can differ by more than 30% depending on the chosen input split design strategy. Additionally, we show that the level of malicious network traffic can slow down processing, in our case by nearly 20%. The scalability of the system is also discussed.
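
To illustrate why splitting network data frames can be a problem, the following minimal sketch contrasts fixed-size byte splits with splits aligned to frame boundaries. It is not one of the four techniques proposed in the paper. It assumes the capture is stored in the classic little-endian pcap layout (24-byte file header, 16-byte record headers) and is well-formed. The function names `build_pcap`, `record_offsets`, `naive_splits`, and `frame_aligned_splits` are hypothetical and chosen only for this example.

```python
import struct

GLOBAL_HEADER_LEN = 24   # classic pcap file header (assumed format)
RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len


def build_pcap(payloads):
    """Build a tiny little-endian pcap byte string for illustration only."""
    header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
    body = b""
    for i, data in enumerate(payloads):
        body += struct.pack("<IIII", i, 0, len(data), len(data)) + data
    return header + body


def record_offsets(pcap):
    """Return the byte offset at which each packet record starts."""
    offsets = []
    pos = GLOBAL_HEADER_LEN
    while pos + RECORD_HEADER_LEN <= len(pcap):
        offsets.append(pos)
        # incl_len is the third 32-bit field of the record header
        incl_len = struct.unpack_from("<I", pcap, pos + 8)[0]
        pos += RECORD_HEADER_LEN + incl_len
    return offsets


def naive_splits(pcap, split_size):
    """Fixed-size byte ranges that ignore record boundaries."""
    return [(start, min(start + split_size, len(pcap)))
            for start in range(0, len(pcap), split_size)]


def frame_aligned_splits(pcap, split_size):
    """Close each split at the first record boundary at or past split_size bytes."""
    splits = []
    start = GLOBAL_HEADER_LEN
    boundaries = record_offsets(pcap)[1:] + [len(pcap)]
    for end in boundaries:
        if end > start and (end - start >= split_size or end == len(pcap)):
            splits.append((start, end))
            start = end
    return splits
```

With a capture built by `build_pcap`, the ranges returned by `naive_splits` generally begin in the middle of a frame, so a node receiving such a range cannot parse its first record without extra work. Every range returned by `frame_aligned_splits` begins at a record header, at the cost of uneven split sizes.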

Full Text