Abstract

With the continuous emergence of various network attacks, it is becoming more and more important to ensure the security of the network. Intrusion detection, as one of the important technologies to ensure network security, has been widely studied. However, class imbalance leads to a challenging problem, that is, the normal data is much more than the attack data. Class imbalance will lead to the deviation of decision boundary, which makes higher value attack data classification error. In the face of imbalanced data, how to make the classification model classify more effectively is called imbalanced learning problem. In this study, we propose a tabular data sampling method to solve the imbalanced learning problem, which aims to balance the normal samples and attack samples. Firstly, for normal samples, on the premise of minimizing the loss of sample information, the K-nearest neighbor method is used for effective undersampling. Then, we design a tabular auxiliary classifier generative adversarial networks model (TACGAN) for attack sample oversampling. TACGAN model is an extension of ACGAN model. We add two loss functions in the generator to measure the information loss between real data and generated data, which makes TACGAN more suitable for the generation of tabular data. Finally, the normal data after undersampling and the attack data after oversampling are mixed to balance the data. We have carried out verification experiments on three real intrusion detection data sets. Experimental results show that the proposed method achieves excellent results in Accuracy, F1, AUC and Recall.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.