Abstract

With the continuous development of network technology, an intrusion detection system must cope with detection-efficiency and storage demands when handling large volumes of data. A reasonable way to alleviate this problem is instance selection, which reduces storage space and improves intrusion detection efficiency by keeping only representative instances. An instance can be representative not only within its own class but also with respect to other classes, and this representativeness reflects the importance of the instance. Because existing instance selection algorithms do not take both aspects into account, some selected instances are redundant and some important instances are removed, which increases storage space and reduces efficiency. Therefore, a new measure of instance representativeness is proposed. It considers the influence of all instances of the same class on the selected instance as well as the influence of instances of different classes, and it treats the latter as an advantageous factor. Based on this representativeness, two instance selection algorithms are proposed to handle balanced and imbalanced data in intrusion detection. The first, RBIS, is a representative-based instance selection algorithm for balanced data that selects the same proportion of instances from each class. The second, RBIS-IM, is a representative-based instance selection algorithm for imbalanced data that selects important majority instances according to the number of instances in the minority class. In comparisons with other algorithms on benchmark intrusion detection data sets, experimental results verify the effectiveness of RBIS and RBIS-IM and show that they achieve a better balance between accuracy and reduction rate (RBIS) or between balanced accuracy and reduction rate (RBIS-IM).
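The two selection strategies described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the representativeness measure itself is not defined in this excerpt, so `scores` is a placeholder list holding one precomputed score per instance. The function names and the rule of keeping exactly as many majority instances as there are minority instances are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_class(labels):
    """Map each class label to the list of instance indices that carry it."""
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)
    return groups


def top_k(indices, scores, k):
    """Return the k indices with the highest score (ties broken by index)."""
    ranked = sorted(indices, key=lambda i: (-scores[i], i))
    return ranked[:k]


def select_balanced(labels, scores, proportion):
    """RBIS-style idea: keep the same proportion of instances from each class.

    `scores` stands in for the representativeness values (assumption).
    """
    selected = []
    for indices in group_by_class(labels).values():
        k = max(1, round(proportion * len(indices)))
        selected.extend(top_k(indices, scores, k))
    return sorted(selected)


def select_imbalanced(labels, scores):
    """RBIS-IM-style idea: size each class by the minority class count.

    Assumption: every class keeps at most as many instances as the
    smallest class has, so the minority class is kept whole.
    """
    groups = group_by_class(labels)
    minority_size = min(len(indices) for indices in groups.values())
    selected = []
    for indices in groups.values():
        selected.extend(top_k(indices, scores, minority_size))
    return sorted(selected)


# Hypothetical toy data: four "normal" records and two "attack" records.
labels = ["normal"] * 4 + ["attack"] * 2
scores = [0.1, 0.9, 0.5, 0.7, 0.3, 0.8]
balanced_pick = select_balanced(labels, scores, proportion=0.5)
imbalanced_pick = select_imbalanced(labels, scores)
```

With this toy data, the balanced variant keeps half of each class, while the imbalanced variant keeps both attack records and only the two highest-scoring normal records.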

Highlights

  • Along with the continuous development of network technology and 5G, smart systems are becoming more and more common in all fields of human life, such as finance, agriculture, and education

  • Compared with other algorithms on benchmark intrusion detection data sets, the RBIS algorithm achieves a better balance between accuracy and reduction rate

  • After analyzing the instance selection algorithm and its defects in intrusion detection, we propose a new representativeness of instance to determine the importance of an instance

Summary

Introduction

Along with the continuous development of network technology and 5G, smart systems are becoming increasingly common in all fields of human life, such as finance, agriculture, and education. Because intrusion detection technology can effectively protect smart systems and detect attacks, its development has attracted attention from countries all over the world [1, 2]. From the perspective of classification, the main goal of building an intrusion detection system (IDS) is to train a classifier that can distinguish between normal and intrusive data in the original network data set. The IDS based on machine learning, which directly uses a large amount of network data to detect attacks, has become an important part of IDS [3]. These network data can cost the IDS considerable time and storage space. Instance selection is used in IDS to select important data from the original data and serves two goals. One is to reduce the number of instances the IDS requires in the training phase, thereby saving time and reducing the amount of computation needed to train the classifier. The other is to improve the performance of the trained classifier by training it on effective instances [4, 5, 6].
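The two goals above correspond to the quantities the abstract trades off: reduction rate on one side, and accuracy or balanced accuracy on the other. The sketch below uses the common textbook definitions, which are assumed here because this excerpt does not give the paper's exact formulas. Reduction rate is taken as the fraction of training instances removed, and balanced accuracy as the mean of the per-class recalls.

```python
def reduction_rate(n_original, n_selected):
    """Fraction of the original training instances that were removed."""
    return 1.0 - n_selected / n_original


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so each class counts equally regardless of size."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Hypothetical example: 100 training instances reduced to 25.
rate = reduction_rate(100, 25)

# Hypothetical predictions on an imbalanced test set (4 normal, 2 attack).
y_true = ["normal", "normal", "normal", "normal", "attack", "attack"]
y_pred = ["normal", "normal", "normal", "attack", "attack", "normal"]
score = balanced_accuracy(y_true, y_pred)
```

In this example plain accuracy would be 4/6, but balanced accuracy is lower because one of only two attack records is missed. That sensitivity is why balanced accuracy is the more informative measure for imbalanced data.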

