In Industrial Cyber–Physical Systems (ICPSs), an attacker can intrude into the cyber system using a variety of penetration tools and then attack the physical system. Payload-based traffic anomaly detection is a popular defense against such attacks. Because normal and attack samples in ICPSs are imbalanced, existing payload-based detection methods mostly rely on unsupervised learning and typically comprise a word segmentation model and an unsupervised classifier. However, existing methods may disrupt semantic correlations and struggle to extract complex payload dependency relationships. To address these issues, this paper proposes a traffic anomaly detection approach consisting of a data preprocessing model, an unsupervised word segmentation model, and an autoencoder-based unsupervised classification model. The unsupervised word segmentation model uses Long Short-Term Memory (LSTM) to calculate the probability of each candidate word segmentation, addressing the inaccurate segmentation results of existing payload segmentation models. The unsupervised classification model combines a 1D Convolutional Neural Network (1D-CNN) with Bidirectional Encoder Representations from Transformers (BERT) to extract complex payload dependency relationships, a challenge for existing classification models. The proposed approach is evaluated on a Cyber–Physical Attack Dataset (CPAD). Compared with state-of-the-art detection approaches, it improves Precision by 18.83%, Recall by 22.3%, and the F1 score by 20.60%.
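The idea of scoring every candidate segmentation and keeping the most probable one can be illustrated with a minimal sketch. This is not the paper's model: the abstract does not give the scoring details, so the LSTM is replaced here by a hypothetical lookup table (`TOY_LOG_PROBS`), and the function names, the `max_len` limit, and the sample payload bytes are all assumptions made for illustration. Dynamic programming is used so that all segmentations are considered without enumerating them one by one.

```python
import math
from typing import Callable, List, Sequence, Tuple

Word = Tuple[str, ...]


def best_segmentation(
    tokens: Sequence[str],
    log_prob: Callable[[Word], float],
    max_len: int = 4,
) -> Tuple[List[Word], float]:
    """Return the segmentation with the highest total log-probability.

    `log_prob` stands in for a learned scorer (an LSTM in the paper);
    here it is any function mapping a candidate word to a log-probability.
    """
    n = len(tokens)
    best = [-math.inf] * (n + 1)  # best[i]: best score for tokens[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last word
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + log_prob(tuple(tokens[start:end]))
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers to recover the chosen words.
    words: List[Word] = []
    end = n
    while end > 0:
        start = back[end]
        words.append(tuple(tokens[start:end]))
        end = start
    words.reverse()
    return words, best[n]


# Hypothetical scores; a trained model would supply these instead.
TOY_LOG_PROBS = {
    ("01", "03"): math.log(0.4),
    ("00", "0a"): math.log(0.3),
}


def toy_log_prob(word: Word) -> float:
    # Unknown words get a fixed per-byte penalty (an assumption).
    return TOY_LOG_PROBS.get(word, math.log(1e-4) * len(word))


payload = ["01", "03", "00", "0a"]  # made-up payload bytes in hex
words, score = best_segmentation(payload, toy_log_prob)
print(words, score)
```

With these toy scores, the payload is split into the two byte pairs that the table rates as likely, rather than into single bytes or one long word. Swapping `toy_log_prob` for a trained sequence model would leave the search itself unchanged.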