Network Traffic Classification Using Feature Selections and Two-Tier Stacked Classifier

Rahul Adhao, Vinod Pachghare

doi:10.47164/ijngc.v12i5.422

Abstract

The datasets available for evaluating intrusion detection systems (IDS) are noisy and highly imbalanced. The noise can be reduced through dataset pre-processing and feature selection. These datasets contain many records for some class labels (e.g., DoS, DDoS, Port Scan: majority attacks) and very few records for others (e.g., U2R, R2L: minority attacks), which makes them imbalanced. A single machine learning classifier trained on such a dataset becomes biased towards the majority attack records and may fail to detect minority attacks. One possible way to reduce this class imbalance is to divide the dataset into majority and minority attacks. The proposed approach divides the dataset into majority and minority groups to address the imbalance and uses a two-tier classification approach to classify majority and minority attacks. The CICIDS2017 and NSL-KDD datasets are used to evaluate the proposed system, which achieves an accuracy of 98.30% on CICIDS2017 and 99.71% on NSL-KDD. The model's performance is evaluated in terms of precision, accuracy, and F1 score, and is observed to be superior to that of existing work in the field of intrusion detection.
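The majority/minority split and two-tier routing described in the abstract can be illustrated with a small sketch. This is one possible reading of the idea, not the authors' implementation: the abstract does not specify the classifiers, the split criterion, or how the tiers are combined. The names `split_labels`, `NearestCentroid`, and `TwoTier`, the share-based `threshold`, and the toy two-feature data are all assumptions made for illustration only. Here tier 1 routes a record to the majority or minority group, and tier 2 assigns the class within that group.

```python
from collections import Counter


def split_labels(labels, threshold=0.05):
    """Partition class labels into majority/minority sets by share of records.

    The share-based threshold is an illustrative assumption, not from the paper.
    """
    counts = Counter(labels)
    total = len(labels)
    majority = {c for c, n in counts.items() if n / total >= threshold}
    return majority, set(counts) - majority


class NearestCentroid:
    """Tiny stand-in classifier (assumption); any real classifier could be used."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        # Mean feature vector per class.
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, row):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[c]))
        return min(self.centroids, key=sq_dist)


class TwoTier:
    """Tier 1 picks the group (majority/minority); tier 2 picks the class in it."""

    def fit(self, X, y, threshold=0.05):
        majority, _ = split_labels(y, threshold)
        groups = ["majority" if label in majority else "minority" for label in y]
        self.router = NearestCentroid().fit(X, groups)
        self.experts = {}
        for g in set(groups):
            rows = [(x, label) for x, label, gg in zip(X, y, groups) if gg == g]
            self.experts[g] = NearestCentroid().fit(
                [x for x, _ in rows], [label for _, label in rows]
            )
        return self

    def predict_one(self, row):
        group = self.router.predict_one(row)
        return self.experts[group].predict_one(row)


# Hypothetical toy data: two frequent and two rare attack classes.
X = ([[0.0, 0.0]] * 45 + [[10.0, 0.0]] * 45
     + [[0.0, 10.0]] * 5 + [[10.0, 10.0]] * 5)
y = ["DoS"] * 45 + ["PortScan"] * 45 + ["U2R"] * 5 + ["R2L"] * 5

model = TwoTier().fit(X, y, threshold=0.1)
print(model.predict_one([0.5, 9.5]), model.predict_one([9.0, 1.0]))
```

Because each tier-2 model only sees records from its own group, the rare classes are no longer competing directly with the frequent ones during training, which is the motivation the abstract gives for the split.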
