Abstract

The goal of anomaly-based intrusion detection is to build a system which monitors computer network behaviour and generates alerts if either a known attack or an anomaly is detected. An anomaly-based intrusion detection system detects intrusions using a reference model which characterizes the normal behaviour of the computer network and flags deviations from it as anomalies. The basic challenges in anomaly-based detection are the difficulty of identifying ‘normal’ network behaviour and the complexity of the dataset needed to train the intrusion detection system. Supervised machine learning can be used to train binary classifiers to recognize the notion of normality. In this paper we present an algorithm for feature selection and instance normalization which reduces the Kyoto 2006+ dataset in order to increase accuracy and decrease the time for training, testing and validating intrusion detection systems based on five models: k-Nearest Neighbour (k-NN), weighted k-NN (wk-NN), Support Vector Machine (SVM), Decision Tree, and Feedforward Neural Network (FNN).

Highlights

  • Intrusion detection systems (IDSs) protect computer networks from malicious activities which compromise network security and affect the confidentiality, integrity and availability of the data

  • This paper presents five models used for binary classification: k-Nearest Neighbour (k-NN), weighted k-NN (wk-NN) (Hechenbichler & Schliep, 2004), Support Vector Machine (SVM) (Burgess, 1998, 283-298), Decision Tree (Sebastiani, 2002, 13) and Feedforward Neural Network (FNN) (Protic & Milosavljevic, 2006, 643-646)

  • We present the results of experiments conducted on the normalized instances with five models: k-NN, wk-NN, SVM, Decision Tree and FNN


Summary

Introduction

Intrusion detection systems (IDSs) protect computer networks from malicious activities which compromise network security and affect the confidentiality, integrity and availability of the data. The goal of anomaly-based detection is to build a statistical model that describes the normal behaviour of the computer network and to look for activities which differ from that model. It detects both intrusions and misuse, and classifies network activity as either ‘normal’ or ‘anomaly’. The Kyoto 2006+ dataset is designed to support the evaluation of network-based intrusion detection systems (NIDS). It consists of 14 statistical features derived from the KDD Cup '99 dataset (1999) and 10 additional features which can be used for evaluation and further analyses of NIDS (Protic, 2018, 580-595). The 10 additional features are: ‘Label’, which indicates normal traffic or attacks; four features describing source and destination addresses and port numbers; two features describing the start time and duration of the session; and three features for IDS, malware and Ashula detection (see Table 2).
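To make the idea of normalizing instances and then classifying them as ‘normal’ or ‘anomaly’ more concrete, the following is a minimal sketch in Python. It is not the algorithm presented in the paper: min-max scaling is assumed here as the normalization step, inverse-distance voting is assumed as the wk-NN weighting, and the function names, the two toy features and all values are hypothetical.

```python
import math
from collections import defaultdict


def min_max_normalize(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0.

    Assumption: min-max scaling stands in for the paper's normalization step.
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def weighted_knn_predict(train_x, train_y, query, k=3):
    """Vote among the k nearest neighbours, weighting votes by inverse distance.

    Assumption: inverse-distance weights are one common wk-NN choice,
    not necessarily the kernel used in the paper.
    """
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in nearest:
        votes[label] += 1.0 / (distance + 1e-9)  # epsilon avoids division by zero
    return max(votes, key=votes.get)


# Hypothetical toy instances with two made-up features per row.
raw = [
    [0.1, 200.0],    # labelled normal
    [0.2, 180.0],    # labelled normal
    [5.0, 9000.0],   # labelled anomaly
    [4.5, 8500.0],   # labelled anomaly
    [0.15, 210.0],   # unlabelled query instance
]
labels = ["normal", "normal", "anomaly", "anomaly"]

scaled = min_max_normalize(raw)
train_x, query = scaled[:-1], scaled[-1]
prediction = weighted_knn_predict(train_x, labels, query, k=3)
print(prediction)
```

Normalizing first keeps a feature with a large numeric range from dominating the distance computation, which is one reason a normalization step can affect the accuracy of distance-based models such as k-NN and wk-NN.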

Feature Selection
Machine Learning Models
Weighted k-Nearest Neighbour
Support Vector Machine
Decision Tree
Feedforward Neural Network
Results
Conclusion