Benchmarking of Machine Learning for Anomaly Based Intrusion Detection Systems in the CICIDS2017 Dataset

Ziadoon Kamil Maseer, Salama A Mostafa, Nazrulazhar Bahaman, Cik Feresa Mohd Foozy, Robiah Yusof

doi:10.1109/access.2021.3056614

Abstract

An intrusion detection system (IDS) is an important protection instrument for detecting complex network attacks. Various machine learning (ML) and deep learning (DL) algorithms have been proposed for implementing anomaly-based IDS (AIDS). Our review of the AIDS literature identifies several issues in related work, including the arbitrary selection of algorithms, parameters, and testing criteria, the use of old datasets, and shallow analysis and validation of the results. This paper comprehensively reviews previous studies on AIDS by using a set of criteria with different datasets and types of attacks to set benchmarking outcomes that can reveal the suitable AIDS algorithms, parameters, and testing criteria. Specifically, this paper applies 10 popular supervised and unsupervised ML algorithms to identify effective and efficient ML-AIDS for networks and computers. The supervised ML algorithms are the artificial neural network (ANN), decision tree (DT), k-nearest neighbor (k-NN), naive Bayes (NB), random forest (RF), support vector machine (SVM), and convolutional neural network (CNN) algorithms, whereas the unsupervised ML algorithms are the expectation-maximization (EM), k-means, and self-organizing maps (SOM) algorithms. Several models of these algorithms are introduced, and the tuning and training parameters of each algorithm are examined to achieve an optimal classifier evaluation. Unlike previous studies, this study evaluates the performance of AIDS by measuring the true positive and true negative rates, accuracy, precision, recall, and F-Score of 31 ML-AIDS models. The training and testing times of the ML-AIDS models are also considered in measuring their performance efficiency, given that time complexity is an important factor in AIDS. The ML-AIDS models are tested on the recent, highly imbalanced, multiclass CICIDS2017 dataset, which involves real-world network attacks. In general, the k-NN-AIDS, DT-AIDS, and NB-AIDS models obtain the best results and show a greater capability in detecting web attacks than the other models, which demonstrate irregular and inferior results.
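The evaluation measures named in the abstract (true positive rate, true negative rate, accuracy, precision, recall, and F-Score) can all be derived from per-class confusion counts. The following is a minimal sketch in plain Python, not the authors' implementation: the function names `per_class_metrics` and `accuracy`, the one-vs-rest treatment of each class, and the toy labels are assumptions made for illustration only.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """One-vs-rest metrics for each class (hypothetical helper, not from the paper)."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    total = len(y_true)
    report = {}
    for c in labels:
        tp = pairs[(c, c)]
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fp = sum(n for (t, p), n in pairs.items() if t != c and p == c)
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0  # recall is the true positive rate
        tnr = tn / (tn + fp) if tn + fp else 0.0     # true negative rate
        f_score = (2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall,
                     "tnr": tnr, "f_score": f_score}
    return report


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, made-up labels: 8 benign flows and 2 web-attack flows (imbalanced).
y_true = ["benign"] * 8 + ["web_attack"] * 2
y_pred = ["benign"] * 7 + ["web_attack"] + ["web_attack", "benign"]

report = per_class_metrics(y_true, y_pred)
print(accuracy(y_true, y_pred))
print(report["web_attack"])
```

On this toy data the overall accuracy is noticeably higher than the recall on the minority class, which is one reason per-class rates are worth reporting alongside accuracy on an imbalanced dataset.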

Highlights

As more platforms and applications are being connected to networks, data become increasingly vulnerable to malicious attacks.
The machine learning (ML) anomaly-based IDS (AIDS) algorithms are implemented in Python3 with Anaconda 3 on a Dell OptiPlex 3010 computer with an Intel Core i3 3.60 GHz processor, 4 GB of primary memory, and a 2 GB GPU, running Ubuntu 16.04.
This study proposes a benchmarking approach that involves several steps and uses real data to ensure an effective evaluation of AIDS performance based on ML algorithms.
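Because the study reports training and testing time alongside detection quality, a benchmarking loop has to time the two phases separately. The sketch below shows one way this could look in Python. It is an illustration under stated assumptions, not the paper's procedure: `benchmark` and `MajorityClassifier` are hypothetical names, the `fit`/`predict` interface is assumed, and the data are made up.

```python
import time


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def benchmark(model, X_train, y_train, X_test, y_test):
    """Time fit and predict separately and report accuracy (illustrative helper)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    y_pred = model.predict(X_test)
    test_seconds = time.perf_counter() - start

    correct = sum(t == p for t, p in zip(y_test, y_pred))
    return {
        "train_seconds": train_seconds,
        "test_seconds": test_seconds,
        "accuracy": correct / len(y_test),
    }


# Made-up single-feature samples, used only to exercise the helper.
X_train = [[0.1], [0.2], [0.9], [0.3]]
y_train = ["benign", "benign", "attack", "benign"]
X_test = [[0.2], [0.8]]
y_test = ["benign", "attack"]

result = benchmark(MajorityClassifier(), X_train, y_train, X_test, y_test)
print(result)
```

Any model object exposing `fit` and `predict` could be passed to `benchmark` in place of the stand-in, so the same loop can compare several candidate models on identical splits.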

Summary

Introduction

As more platforms and applications are being connected to networks, data become increasingly vulnerable to malicious attacks. Using an intrusion detection system (IDS) is a well-known approach for protecting computer networks. Two popular types of IDS, namely, network- (NIDS) and host-based IDS (HIDS), have been adopted in practice. NIDS monitors network traffic and detects any malicious activity in the network by analyzing the activities of end users [2]. IDS applies two types of detection methods, namely, signature- and anomaly-based methods. Signature-based IDS (or HIDS) detects attacks by identifying patterns (i.e., signatures) in IDS [3]. While this method can detect known malware and attacks based on their

Journal: IEEE Access
Publication Date: Jan 1, 2021
Citations: 250
License type: CC BY 4.0
