Abstract

Due to the severe imbalance in the quantities of normal samples and attack samples, as well as among different types of attack samples, intrusion detection systems suffer from low detection rates for rare-class attack data. In this paper, we propose a geometric synthetic minority oversampling technique based on the optimized kernel density estimation algorithm. This method can generate diverse rare-class attack data by learning the distribution of rare-class attack data while maintaining similarity with the original sample features. Meanwhile, the balanced data is input to a feature extraction module built upon multiple denoising autoencoders, reducing information redundancy in high-dimensional data and improving the detection performance for unknown attacks. Subsequently, a soft-voting ensemble learning technique is utilized for multi-class anomaly detection on the balanced and dimensionally reduced data. Finally, an intrusion detection system is constructed based on data preprocessing, imbalance handling, feature extraction, and anomaly detection modules. The performance of the system was evaluated using two datasets, NSL-KDD and N-BaIoT, achieving 86.39% and 99.94% multiclassification accuracy, respectively. Through ablation experiments and comparison with the baseline model, it is found that the inherent limitations of a single machine-learning model directly affect the accuracy of the intrusion detection system, while the superiority of the proposed multi-module model in detecting unknown attacks and rare classes of attack traffic is demonstrated.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.