Abstract

Network intrusion detection is an important technology for maintaining cybersecurity. Difficulties that co-exist in network traffic datasets, such as class imbalance, class overlap, and noise, limit detection accuracy. Existing research, however, has focused only on the imbalance between classes. To address all of these difficulties together, a novel ensemble method for network intrusion detection, called DUEN, is proposed. Dynamic undersampling is incorporated into the Boosting framework to provide a relatively balanced training subset at each iteration and thereby build a strong ensemble. In dynamic undersampling, the number of samples drawn is determined by learning the data distribution, which mitigates the small-sample overfitting caused by simple undersampling. The concept of classification hardness, which integrates the data difficulties, is used to identify samples as easy, borderline, or noisy. A borderline-sample importance enhancement mechanism is proposed for the undersampling process to prevent overfitting to noisy samples. The proposed sampling method involves no distance calculation and therefore has very low computational cost. Multiclass classification experiments on two datasets widely used for evaluating intrusion detection show the effectiveness of DUEN. DUEN achieves high detection accuracy for both minority and majority class samples and a significant advantage in time efficiency. Its Macro-F1 reaches 70.7% on NSL-KDD and 50.1% on UNSW-NB15, improvements of 6.32% and 1.42% over the second-best network intrusion detection methods. Moreover, the experimental analysis shows that DUEN generalizes well and is robust to noisy samples.
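
The hardness-based undersampling idea can be illustrated with a minimal sketch. This is not the DUEN algorithm itself: the abstract gives no formulas, so the hardness definition (for example, the absolute gap between the true label and the current ensemble's predicted probability), the thresholds `easy_max` and `noisy_min`, the type weights, and the function names are all assumptions made for illustration. The number of samples to keep is passed in by the caller rather than learned from the data distribution. The sketch only shows how majority-class samples can be drawn without any distance calculation while favouring borderline samples and down-weighting likely noise.

```python
import random


def classify_hardness(h, easy_max=0.3, noisy_min=0.8):
    """Label a sample by its hardness h in [0, 1].

    The thresholds are hypothetical; the abstract does not specify them.
    """
    if h < easy_max:
        return "easy"
    if h >= noisy_min:
        return "noisy"
    return "borderline"


def undersample_majority(hardness, n_keep, weights=None, seed=0):
    """Return sorted indices of n_keep majority samples.

    Sampling is weighted and without replacement: borderline samples get a
    larger weight, noisy samples a smaller one (assumed weights).
    """
    if weights is None:
        weights = {"easy": 1.0, "borderline": 4.0, "noisy": 0.25}
    rng = random.Random(seed)
    keyed = []
    for i, h in enumerate(hardness):
        w = weights[classify_hardness(h)]
        # Efraimidis-Spirakis key: the largest keys form a weighted sample.
        keyed.append((rng.random() ** (1.0 / w), i))
    keyed.sort(reverse=True)
    return sorted(i for _, i in keyed[:n_keep])


if __name__ == "__main__":
    # Toy hardness values for 900 majority-class samples.
    pool = [0.1] * 300 + [0.5] * 300 + [0.9] * 300
    kept = undersample_majority(pool, n_keep=300)
    counts = {"easy": 0, "borderline": 0, "noisy": 0}
    for i in kept:
        counts[classify_hardness(pool[i])] += 1
    print(counts)
```

In a Boosting loop, a step like this would be repeated at every iteration with hardness recomputed from the current ensemble, so the retained subset shifts as the model improves. Only a sort over per-sample keys is needed, which is consistent with the low computational cost the abstract attributes to avoiding distance calculations.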
