Abstract

Because of their feasibility and effectiveness, artificial intelligence-based intrusion detection systems attract considerable interest from researchers. However, when confronted with large-scale data sets, many such systems can suffer from a high computational burden, even when feature selection is used to reduce the computational complexity. To improve efficiency, we propose a representative instance selection method that preprocesses the original data set before a classifier is trained and that is independent of the learning algorithm used to construct the intrusion detection system. In this study, a new metric is introduced to measure the representative power of an instance with respect to its class. Using this measure of representativeness, we select the most representative instance in each subset produced by a novel centroid-based partitioning strategy, and we then use the selected instances as training data to build various intrusion detection models efficiently. Experimental results on a labelled flow-based data set introduced in 2009 show that ANN, KNN, SVM and Liblinear classifiers trained on a greatly reduced set of representative instances not only detect network attacks with high efficiency but also achieve detection rate, precision, F-score and accuracy comparable to those of the four corresponding classifiers built with the original large data set.
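The overall pipeline (partition each class into subsets, then keep one representative instance per subset) can be illustrated with a minimal sketch. The abstract does not define the representativeness metric or the partitioning strategy, so both are stand-ins here and are assumptions, not the proposed method: the hypothetical `partition_by_centroid_distance` ranks instances by distance to their class centroid and cuts the ranking into near-equal chunks, and representativeness is approximated by closeness to the subset centroid. All function and parameter names are illustrative.

```python
from collections import defaultdict
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def partition_by_centroid_distance(points, n_subsets):
    """Assumed partitioning (not the paper's): rank points by distance to
    the class centroid and cut the ranking into near-equal chunks."""
    c = centroid(points)
    ranked = sorted(points, key=lambda p: dist(p, c))
    n_subsets = max(1, min(n_subsets, len(ranked)))
    size, extra = divmod(len(ranked), n_subsets)
    subsets, start = [], 0
    for i in range(n_subsets):
        end = start + size + (1 if i < extra else 0)
        subsets.append(ranked[start:end])
        start = end
    return subsets


def select_representatives(X, y, n_subsets=2):
    """Return one (instance, label) pair per subset of each class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(tuple(features))

    selected = []
    for label, points in by_class.items():
        for subset in partition_by_centroid_distance(points, n_subsets):
            c = centroid(subset)
            # Placeholder representativeness: closeness to the subset centroid.
            best = min(subset, key=lambda p: dist(p, c))
            selected.append((best, label))
    return selected


if __name__ == "__main__":
    # Toy data: four "normal" flows and two "attack" flows (made-up values).
    X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0), (10.0, 12.0)]
    y = ["normal"] * 4 + ["attack"] * 2
    for instance, label in select_representatives(X, y, n_subsets=2):
        print(label, instance)
```

Because the selection runs before any classifier is trained, the reduced set returned by `select_representatives` could be passed to any learner, which mirrors the algorithm-independence claimed above.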
