Towards a generalized hybrid deep learning model with optimized hyperparameters for malicious traffic detection in the Industrial Internet of Things

Mohammed Abubaker, Bilal Babayigit

doi:10.1016/j.engappai.2023.107515

Abstract

Detecting malicious attacks in the Industrial Internet of Things (IIoT) is crucial to minimizing downtime and financial losses. However, existing deep learning (DL) research is limited in reliability and generalization because it relies on a single dataset, which hinders the effectiveness of the resulting models when they are applied to new and unseen datasets. This paper addresses the issue using multiple-domain learning, creating a generalized DL framework for IIoT traffic classification that combines three datasets: Edge-IIoTSet, WUSTL-IIoT-2021, and X-IIoTID. The proposed framework has two stages. The first stage fuses the different datasets into a common feature space. To achieve this, an autoencoder architecture is proposed to map the datasets, which differ in dimensionality, into a common feature space. Subsequently, a modified locally linear embedding is used for manifold alignment to ensure statistical matching of the datasets, minimizing the maximum mean discrepancy between them so that they can be compared and combined effectively. The second stage presents a hybrid DL model that combines a convolutional neural network (CNN) and a gated recurrent unit (GRU) for IIoT traffic classification, with hyperparameters tuned by Bayesian optimization. Experiments on combined, single, and cross-dataset settings show the superiority of the framework. The CNN-GRU model achieved 97.68% accuracy, 97.70% recall, 97.67% precision, and 97.68% F1-score for binary classification on the combined dataset. Transfer learning improved accuracy from 50.95% to 97.80% and F1-score from 48.52% to 97.79% when the model was trained on X-IIoTID and tested on Edge-IIoTSet. For multi-class classification, the accuracy and the weighted averages of recall, precision, and F1-score were approximately 98.85%.
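
The maximum mean discrepancy (MMD) mentioned in the abstract measures how far apart two sample sets are in distribution. The following is a minimal sketch of a biased squared-MMD estimate, not the authors' implementation: the RBF kernel, the value of `gamma`, the helper names, and the synthetic Gaussian samples are all assumptions made for illustration, and the modified locally linear embedding step itself is not shown.

```python
import math
import random


def rbf(u, v, gamma):
    """RBF kernel between two feature vectors (kernel choice is an assumption)."""
    return math.exp(-gamma * sum((ui - vi) ** 2 for ui, vi in zip(u, v)))


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(rbf(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2(xs, ys, gamma=0.1):
    """Biased estimate of the squared MMD between two sample sets."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )


def sample(rng, n, dim, mean):
    """Hypothetical stand-in for embedded traffic features: Gaussian samples."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
a = sample(rng, 100, 4, 0.0)  # "dataset A" in the common feature space
b = sample(rng, 100, 4, 0.0)  # same distribution as A
c = sample(rng, 100, 4, 2.0)  # shifted distribution, i.e. poorly aligned

mmd_same = mmd2(a, b)
mmd_shifted = mmd2(a, c)
```

In this sketch `mmd_shifted` comes out larger than `mmd_same`, which is the behavior an alignment step would exploit: the smaller the discrepancy between embedded datasets, the more safely they can be pooled for training.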

Full Text