Abstract

Intrusion Detection Systems, specifically Network Anomaly Detection Systems (NADSs), are vital tools in network security. NADSs suffer from data imbalance, which degrades their classification of minority classes. An efficient detection framework is therefore needed to achieve a higher detection rate for minority classes, especially when ensemble learning methods are used. To address the imbalance, we propose a hybrid sampling method, SKM, that integrates the Synthetic Minority Oversampling Technique (SMOTE) with the K-means clustering algorithm: SMOTE over-samples the minority class, and K-means performs cluster-based under-sampling. To reduce data dimensionality, we use a Denoising Autoencoder (DAE) to select the 15 features with the highest weights. For anomaly detection, the XGBoost algorithm is deployed, and the SHapley Additive exPlanations (SHAP) approach is used to provide explanations of the proposed model. The performance of the SKM-XGB model is assessed on the NSL-KDD and UNSW-NB15 datasets through a series of experiments and a comparative analysis against several ensemble models with multiple base classifiers. On the UNSW-NB15 dataset, the model's detection rate is 99.01% for binary classification and 97.49% for multiclass classification. On the NSL-KDD dataset, it achieves a 99.37% detection rate for binary classification and 99.22% for multiclass classification. The results indicate that SKM-XGB outperforms the other investigated models as well as state-of-the-art models.
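The hybrid sampling idea can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and several details are assumptions the abstract does not specify: the function names (`smote`, `kmeans`, `kmeans_undersample`), the toy two-dimensional data, the balancing target of 40 samples per class, the use of Euclidean distance, and the under-sampling rule, which here clusters the majority class and keeps the real sample nearest each centroid.

```python
import random
from math import dist

def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.
    Assumes the minority class has at least two samples."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x inside the minority class, excluding x
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's algorithm; returns centroids and the last cluster assignment."""
    rng = rng or random.Random(0)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # leave the centroid unchanged if its cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, clusters

def kmeans_undersample(majority, n_keep, rng=None):
    """Assumed rule: cluster the majority class into n_keep clusters and keep
    the real sample closest to each centroid (empty clusters are skipped)."""
    centroids, clusters = kmeans(majority, n_keep, rng=rng)
    return [min(members, key=lambda p: dist(p, centroid))
            for centroid, members in zip(centroids, clusters) if members]

# Toy imbalanced data: 200 majority points versus 10 minority points.
rng = random.Random(42)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
minority = [[rng.gauss(4, 0.5), rng.gauss(4, 0.5)] for _ in range(10)]

target = 40  # hypothetical per-class size after rebalancing
synthetic = smote(minority, target - len(minority), rng=rng)
new_minority = minority + synthetic
new_majority = kmeans_undersample(majority, target, rng=rng)
print(len(new_minority), len(new_majority))
```

In practice, library implementations of SMOTE and K-means would replace these hand-written functions, and the rebalanced data would then feed feature selection and the XGBoost classifier.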
