Network Intrusion Detection through Discriminative Feature Selection by Using Sparse Logistic Regression

Reehan Shah,Muhammad Alvi,Yuntao Qian,Dileep Kumar,Munwar Ali

doi:10.3390/fi9040081

Reehan Shah, Muhammad Alvi + Show 3 more

Open Access

https://doi.org/10.3390/fi9040081

Copy DOI

Abstract

Intrusion detection system (IDS) is a well-known and effective component of network security that provides transactions upon the network systems with security and safety. Most of earlier research has addressed difficulties such as overfitting, feature redundancy, high-dimensional features and a limited number of training samples but feature selection. We approach the problem of feature selection via sparse logistic regression (SPLR). In this paper, we propose a discriminative feature selection and intrusion classification based on SPLR for IDS. The SPLR is a recently developed technique for data analysis and processing via sparse regularized optimization that selects a small subset from the original feature variables to model the data for the purpose of classification. A linear SPLR model aims to select the discriminative features from the repository of datasets and learns the coefficients of the linear classifier. Compared with the feature selection approaches, like filter (ranking) and wrapper methods that separate the feature selection and classification problems, SPLR can combine feature selection and classification into a unified framework. The experiments in this correspondence demonstrate that the proposed method has better performance than most of the well-known techniques used for intrusion detection.

Highlights

As the proliferating growth of computer network activities and sensitive information on network systems increases, more and more organizations are becoming susceptible to a wider variety of attacks
In an effort to arm on the threat of future unknown cyber-attacks, considerable work has gone into investigating and growing intrusion detection systems to help filter out associated malware, exploits, and vulnerabilities
In the domain of network security (IDS), machine learning (ML), data mining (DM), and feature selection (FS) performs significant roles because many researchers are working on that domain to improve the performance of learning algorithms before applying in the different fields such as text mining, computer vision, and image processing etc

Summary

Introduction

As the proliferating growth of computer network activities and sensitive information on network systems increases, more and more organizations are becoming susceptible to a wider variety of attacks. The signature of attacks are known and defined priori and IDS attempts to identify the behavior as either normal or abnormal via analyzing the network connection sample to known intrusion pattern recognized by human specialists. Input features have sufficient knowledge about the normal and abnormal users; despite the fact that usually ADS detects novel or unknown attacks for the system but usually it has high false positive rates [6]. One serious challenge for IDS is feature selection (FS) from network traffic data and researchers employed feature selection techniques for the classification problems [8]. (4) The SPLR has shown different characteristics that are exceptionally suitable for IDS such as high classification accuracy (detection rate [DR]) and training time to build a model and average training time per sample etc (2) The SPLR reduces the cost function for IDS classification with a sparsity constraint. (3) Regularization through SPLR, feature selection has been mapped into penalty term of sparsity optimization in order to select more effective and interpretable features for IDS classification. (4) The SPLR has shown different characteristics that are exceptionally suitable for IDS such as high classification accuracy (detection rate [DR]) and training time to build a model and average training time per sample etc

Related Work

Experiment Design

Classification Detection Rate

Findings

Conclusions and Future Work