Payload-based anomaly detection (PAD) models are commonly built on large collections of normal payload samples, and can therefore discover zero-day attacks and unknown faults without any negative samples in the training phase. However, such detection models face new challenges in adapting to the emerging industrial Internet of things (IIoT). First, modern industrial processes usually run at very high complexity, which makes the payloads much more complex and diverse. Second, industrial data is often too sensitive to be shared publicly, which raises a new privacy concern. To tackle these challenges, we propose PAD, a novel privacy-preserved payload-based anomaly detection model for IIoT. The basic idea is to train a two-dimensional convolutional neural network (2D-CNN) based auto-encoder on normal payloads over a federated generative adversarial network (GAN) architecture, and then to detect anomalies from an unexpected dissimilarity between the original payloads and the payloads reconstructed by the auto-encoder. With the 2D-CNN, we can model the normal payloads from the request and response directions simultaneously, and thus have more opportunities to capture the complex and dynamic industrial behaviors that may be reflected in bi-directional network communications. With the GAN, we can train a more generalized auto-encoder that is able to reconstruct more general payload samples without needing them in advance for model training. With the federated architecture, we remove the need to share normal payloads directly and instead learn from them indirectly by aggregating local models across different industrial data owners, thereby preserving payload privacy. We have evaluated PAD on four public industrial payload datasets under four typical IIoT PAD scenarios. The detection results achieve an F1 score of more than 0.966 in the global condition and at least 0.753 in all federated settings, demonstrating the effectiveness of PAD with privacy preserved.
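The two mechanisms named above, aggregating local models across data owners and flagging payloads whose reconstruction differs too much from the original, can be illustrated with a minimal sketch. The abstract does not state the aggregation rule, the dissimilarity measure, or the decision threshold, so the sample-weighted averaging (FedAvg-style), the mean squared error, and all function and parameter names below are assumptions made for illustration only, not the authors' implementation.

```python
from typing import List, Sequence


def federated_average(local_weights: Sequence[Sequence[float]],
                      sample_counts: Sequence[int]) -> List[float]:
    """Aggregate local parameter vectors into one global vector.

    Assumption: a sample-weighted mean (FedAvg-style). Only the
    parameters are exchanged; the raw payloads never leave their owner.
    """
    total = sum(sample_counts)
    dim = len(local_weights[0])
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]


def reconstruction_error(original: Sequence[float],
                         reconstructed: Sequence[float]) -> float:
    """Dissimilarity between a payload and its reconstruction.

    Assumption: mean squared error over the payload values.
    """
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def is_anomalous(original: Sequence[float],
                 reconstructed: Sequence[float],
                 threshold: float) -> bool:
    """Flag a payload when its reconstruction error exceeds a threshold.

    The threshold is a hypothetical tuning parameter.
    """
    return reconstruction_error(original, reconstructed) > threshold


# Two hypothetical data owners holding 1 and 3 normal samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])

# A payload that the auto-encoder reconstructs poorly is flagged.
flagged = is_anomalous([0.0, 1.0], [0.0, 0.0], threshold=0.25)
```

In this sketch the second owner contributes three times the weight of the first because it holds three times as many samples; an unweighted mean would be an equally plausible reading of "aggregating local models".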