Abstract

Because network intrusions are insidious, building an efficient intrusion detection system remains a major challenge, especially in the era of big data, where both the volume of traffic and the dimensionality of the traffic features are high. To address the shortcomings of traditional machine learning algorithms in network intrusion detection, such as insufficient accuracy, a network intrusion detection system based on LightGBM and an autoencoder (AE) is proposed. The proposed LightGBM-AE model comprises three steps: data preprocessing, feature selection, and classification. It uses the LightGBM algorithm for feature selection and then an autoencoder for training and detection. When data containing network intrusion behaviors are fed into the autoencoder, the reconstruction error between the original input and the autoencoder's reconstruction is large, which provides a basis for intrusion detection. An appropriate threshold on the reconstruction error is then set to distinguish symmetrically between normal behavior and attack behavior. The experiments are carried out on the NSL-KDD dataset and implemented in PyTorch. In addition to the autoencoder, a variational autoencoder (VAE) and a denoising autoencoder (DAE) are also used for intrusion detection, and all three are compared with existing machine learning algorithms such as Decision Tree, Random Forest, KNN, GBDT, and XGBoost. Performance is evaluated with classification metrics such as accuracy, precision, recall, and F1-score. The experimental results show that the method can efficiently separate attack behavior from normal behavior according to the reconstruction error, and the comparison with the other methods verifies its effectiveness and superiority.
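The thresholding and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the mean-squared-error measure, and the rule of taking a percentile of the errors observed on normal traffic as the threshold are all assumptions made for illustration, since the abstract only states that "an appropriate threshold" is set. The reconstructions themselves would come from a trained autoencoder, which is outside the scope of this sketch.

```python
# Illustrative sketch only: names and the percentile rule are assumptions,
# not the method reported in the paper.

def reconstruction_error(x, x_hat):
    """Mean squared error between one input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def percentile_threshold(normal_errors, pct=95.0):
    """Assumed rule: use the pct-th percentile of errors on normal traffic."""
    ordered = sorted(normal_errors)
    idx = int(round(pct / 100.0 * (len(ordered) - 1)))
    return ordered[idx]


def detect(errors, threshold):
    """Label a sample 1 (attack) if its error exceeds the threshold, else 0."""
    return [1 if e > threshold else 0 for e in errors]


def evaluate(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score with attack as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up error values, purely to exercise the functions.
    threshold = percentile_threshold([0.1, 0.2, 0.3, 0.4, 0.5])
    predictions = detect([0.2, 0.9, 0.4, 1.5], threshold)
    print(threshold, predictions, evaluate([0, 1, 1, 1], predictions))
```

A lower percentile flags more traffic as attacks, which tends to raise recall at the cost of precision, so in practice the threshold would be tuned on held-out data.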

Highlights

  • In recent years, computer networks have developed rapidly, gradually playing the role of central information systems in modern life

  • The detection methods of intrusion detection systems fall into two categories depending on the modeling method used [2]: misuse-based detection and anomaly-based detection

  • An autoencoder is an unsupervised deep learning framework designed to reconstruct its input at the output while minimizing the reconstruction error [18]
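
The reconstruction objective mentioned in the last highlight can be written out as follows. The notation is an assumption introduced here, not taken from the paper: $f_\theta$ denotes the encoder, $g_\phi$ the decoder, $x_i$ one of $N$ training samples, and $\tau$ the detection threshold. The squared Euclidean norm is one common choice of error measure.

```latex
% Training objective: mean reconstruction error over N samples (assumed notation)
\mathcal{L}(\theta, \phi) = \frac{1}{N} \sum_{i=1}^{N} \left\lVert x_i - g_\phi\bigl(f_\theta(x_i)\bigr) \right\rVert_2^2

% Per-sample error and threshold-based decision rule
e(x) = \left\lVert x - g_\phi\bigl(f_\theta(x)\bigr) \right\rVert_2^2,
\qquad
\hat{y}(x) =
\begin{cases}
  \text{attack}, & e(x) > \tau \\
  \text{normal}, & e(x) \le \tau
\end{cases}
```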


Summary

Introduction

Computer networks have developed rapidly and now serve as central information systems in modern life. The growth in the size, applications, and infrastructure of computer networks has exposed them to serious threats such as malicious activities, network intruders, and network criminals. Dealing with these harmful network activities is a priority and an important research field today. The misuse-based detection method detects attacks by matching observed activity against signatures of known attacks. It is effective for known attacks but not for unknown ones.
