The pervasiveness of the industrial internet of things (IIoT), driven by the application of supervisory control and data acquisition (SCADA), has led to the growth of heterogeneous sensor data, thereby increasing the risk of intrusions and attacks. The number and impact of intruders and their innovative attack techniques are on the rise. Existing intrusion detection systems (IDS) tend to be computationally expensive on this form of data because of the noise it contains. In real-time domains, available methods lag, necessitating additional research into effective feature extraction schemes, which are fundamental to machine learning (ML) under time constraints. This study, through a comparative analysis of several feature selection (FS) techniques, proposes a combination of an efficient ML classifier and an agnostic FS scheme for attack detection and classification in a real-time SCADA network. The flexibility and interoperability of the proposed approach address the computational complexity of vulnerability detection schemes while reducing false alarm rates (FAR) and overall model execution time. With a view to online preprocessing, the proposed technique proceeds in three phases: (i) data preparation, consisting of data cleansing and normalization; (ii) the combination of a pre-pruned decision tree (DT) algorithm and an agnostic Chi-square FS approach, built to obtain an optimal subset of data features for efficient IDS; and (iii) evaluation of the proposed agnostic DT-CH and other FS candidates for anomaly detection.
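The phases listed in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: the toy records, the feature count, the function names, and the pruning limits (`max_depth`, `min_samples`) are all assumptions made for illustration. The Chi-square score here treats min-max normalized feature values as non-negative frequencies, which is one common formulation and may differ from the one used in the paper.

```python
from collections import Counter

def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns become 0.0."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in rows]

def chi_square_scores(rows, labels):
    """Chi-square statistic of each non-negative feature against the class label."""
    n = len(rows)
    class_counts = Counter(labels)
    scores = []
    for j in range(len(rows[0])):
        total = sum(row[j] for row in rows)
        score = 0.0
        for cls, count in class_counts.items():
            observed = sum(row[j] for row, y in zip(rows, labels) if y == cls)
            expected = total * count / n
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores

def select_top_k(scores, k):
    """Indices of the k highest-scoring features, in ascending index order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels, depth=0, max_depth=3, min_samples=2):
    """Pre-pruned tree: stop at max_depth, on small nodes, or on pure nodes."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(rows) < min_samples or gini(labels) == 0.0:
        return majority
    best = None
    for j in range(len(rows[0])):
        # Candidate thresholds: every distinct value except the largest.
        for t in sorted({row[j] for row in rows})[:-1]:
            left = [y for row, y in zip(rows, labels) if row[j] <= t]
            right = [y for row, y in zip(rows, labels) if row[j] > t]
            cost = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    if best is None:  # every feature is constant in this node
        return majority
    _, j, t = best
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    return (j, t,
            build_tree(*zip(*left), depth + 1, max_depth, min_samples),
            build_tree(*zip(*right), depth + 1, max_depth, min_samples))

def predict(tree, row):
    """Walk from the root to a leaf; leaves are class labels (strings)."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

# Hypothetical toy records (three made-up features), not real SCADA data.
raw = [[10.0, 5.0, 1.0], [12.0, 5.0, 3.0], [11.0, 5.0, 2.0],
       [50.0, 5.0, 2.0], [55.0, 5.0, 1.0], [52.0, 5.0, 3.0]]
labels = ["normal", "normal", "normal", "attack", "attack", "attack"]

normalized = min_max_normalize(raw)               # phase (i)
scores = chi_square_scores(normalized, labels)    # phase (ii): rank features
selected = select_top_k(scores, k=1)
reduced = [[row[j] for j in selected] for row in normalized]
tree = build_tree(reduced, labels, max_depth=2)   # phase (ii): pre-pruned DT
predictions = [predict(tree, row) for row in reduced]  # phase (iii), on training data only
print(selected, predictions)
```

Ranking features before training means the tree only searches splits over the retained columns, which is one plausible source of the execution-time savings the abstract describes. A real evaluation would use held-out data and report FAR alongside accuracy.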