Abstract

Network intrusion detection is a relatively mature research topic, but one that remains challenging, particularly as technologies and the threat landscape evolve. Here, a semi-supervised tri-Adaboost (STA) algorithm is proposed. In the algorithm, three different Adaboost algorithms are used as the weak classifiers (both for continuous and categorical data), constituting the decision stumps in the tri-training method. In addition, the chi-square method is used to reduce the feature dimension and improve computational efficiency. We then conduct extensive numerical studies using different training and testing samples from the KDDcup99 dataset. The results demonstrate that (1) high accuracy can be obtained using a training dataset that consists of a small number of labeled and a large number of unlabeled samples; (2) the proposed algorithm is reproducible and consistent over different runs; (3) the proposed algorithm outperforms other existing learning algorithms, even with only a small amount of labeled data in the training phase; and (4) the proposed algorithm has a short execution time and a low false positive rate, while providing a desirable detection rate.

Highlights

  • Date received: 18 January 2019; accepted: 1 April 2019. Handling Editor: Shancang Li. In this age of connectivity, where services, utilities, and the majority of everyday tasks are reliant on the Internet, vast amounts of personal data are stored “in the cloud.”[1] Network attacks that impact the availability or the confidentiality of these services may result in significant losses.[1]

  • The intrusion detection model for the proposed semi-supervised tri-Adaboost (STA) algorithm is presented in section “Proposed network intrusion detection algorithm.” Evaluation results are shown in section “Evaluations.” The conclusion and research work are presented in section “Conclusion.”

  • Bold-faced values for STA indicate that STA performs better than SSM on some metrics, such as false positive rate


Summary

Introduction

In this age of connectivity, where services, utilities, and the majority of everyday tasks are reliant on the Internet, vast amounts of personal data are stored “in the cloud.”[1] Network attacks that impact the availability or the confidentiality of these services may result in significant losses.[1] Due to the constant evolution of today’s network traffic, the majority of recently obtained and stored data are unlabeled in practice. In our data-driven society, it is time-consuming, expensive, and in some cases impractical to obtain a large labeled dataset to improve detection accuracy. The disadvantage of unsupervised methods is the artificial classification of samples, which has lower detection efficiency and accuracy. To overcome these challenges, semi-supervised learning algorithms designed to leverage both unlabeled and labeled data are becoming increasingly popular. We propose a semi-supervised intrusion detection system (SS-IDS) that combines tri-training[12] with three different Adaboost algorithms. This combination allows us to minimize the number of false positives, reduce time consumption, and increase the detection accuracy of the system. The intrusion detection model for the proposed STA algorithm is presented in section “Proposed network intrusion detection algorithm.” Evaluation results are shown in section “Evaluations.” The conclusion and research work are presented in section “Conclusion.”
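To make the tri-training idea concrete, the following is a minimal sketch of its general pseudo-labeling step, not the authors' STA implementation. It assumes binary labels (0 = normal, 1 = attack), and it uses a hypothetical `Stump` class (a single-feature threshold rule) as a simplified stand-in for each of the three Adaboost learners. The names `Stump`, `tri_train`, and `vote`, as well as the toy data, are illustrative assumptions. The sketch also omits the bootstrap sampling and error-rate checks that the standard tri-training procedure typically includes.

```python
from collections import Counter


class Stump:
    """Hypothetical stand-in for one Adaboost learner: a one-feature threshold rule."""

    def __init__(self, feature):
        self.feature = feature
        self.threshold = 0.0

    def fit(self, X, y):
        # Place the threshold midway between the two class means on this feature.
        # Assumes both classes are present and class 1 has the larger values.
        pos = [x[self.feature] for x, label in zip(X, y) if label == 1]
        neg = [x[self.feature] for x, label in zip(X, y) if label == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, x):
        return 1 if x[self.feature] > self.threshold else 0


def tri_train(learners, X_lab, y_lab, X_unlab, rounds=3):
    """Simplified tri-training: each learner is retrained on the labeled data plus
    the unlabeled samples on which the other two learners agree."""
    for clf in learners:
        clf.fit(X_lab, y_lab)
    for _ in range(rounds):
        new_sets = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            X_i, y_i = list(X_lab), list(y_lab)
            for x in X_unlab:
                pj, pk = learners[j].predict(x), learners[k].predict(x)
                if pj == pk:  # the two peers agree, so accept their pseudo-label
                    X_i.append(x)
                    y_i.append(pj)
            new_sets.append((X_i, y_i))
        # Retrain only after all pseudo-labeled sets for this round are built.
        for clf, (X_i, y_i) in zip(learners, new_sets):
            clf.fit(X_i, y_i)
    return learners


def vote(learners, x):
    """Final prediction by majority vote of the three learners."""
    return Counter(clf.predict(x) for clf in learners).most_common(1)[0][0]


# Toy data (illustrative only): two labeled and four unlabeled samples.
X_lab = [(0.1, 0.2, 0.0), (0.9, 0.8, 1.0)]
y_lab = [0, 1]
X_unlab = [(0.2, 0.1, 0.1), (0.8, 0.9, 0.9), (0.3, 0.2, 0.2), (0.7, 0.7, 0.8)]

models = tri_train([Stump(0), Stump(1), Stump(2)], X_lab, y_lab, X_unlab)
print(vote(models, (0.85, 0.9, 0.95)), vote(models, (0.05, 0.1, 0.15)))
```

The design point this illustrates is that a small labeled set seeds three diverse learners, and agreement between any two of them supplies extra training labels for the third, which is how unlabeled traffic can contribute to training.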

Related work
Findings
Conclusion
