Cyber-physical systems (CPS) play a vital role in modern society across sectors ranging from smart grids to water treatment, and their security has become a major concern. As CPS and cyber-attacks grow in complexity and scale, it is imperative to develop defense and prevention strategies for CPS that are adaptive, scalable, and robust. An important research and application direction in this domain is time series anomaly detection in CPS using advanced machine learning techniques such as deep learning and reinforcement learning. However, many anomaly detectors fail to balance detection performance against computational overhead, which limits their applicability in CPS. In this paper, we introduce a novel agent-based dynamic thresholding (ADT) method that uses deep reinforcement learning, specifically a deep Q-network (DQN), to model thresholding in anomaly detection as a Markov decision process. Using anomaly scores generated by an autoencoder and other information perceived from a simulated environment, ADT performs optimal dynamic thresholding control, enabling real-time adaptive anomaly detection for time series. We rigorously evaluated ADT against established benchmarks on realistic datasets from water treatment and industrial control systems, specifically SWaT, WADI, and HAI. The experimental results demonstrate ADT's superior detection performance, dynamic thresholding capability, data-efficient learning, and robustness. Notably, even when trained on minimal labeled data, ADT consistently outperforms the benchmarks, with F1 scores ranging from 0.995 to 0.999 across all datasets. It remains effective in challenging scenarios where the environmental feedback is noisy, delayed, or partial. Beyond its direct use as an anomaly detector, ADT can also act as a lightweight dynamic thresholding controller that boosts other anomaly detection models. This underscores ADT's considerable promise in sophisticated and dynamic CPS environments.
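To make the idea of treating thresholding as a Markov decision process concrete, the following is a minimal sketch, not the method evaluated in this paper. It swaps the DQN for tabular Q-learning so that it runs with the Python standard library alone. The state (a bucketed anomaly score), the discrete threshold set `THRESHOLDS`, the +1/-1 reward, and all function names are illustrative assumptions; the abstract does not specify any of them.

```python
import random

# Hypothetical discrete action set: each action selects one threshold.
THRESHOLDS = [0.2, 0.5, 0.8]
N_STATES = 5  # assumed number of buckets for anomaly scores in [0, 1]


def reward(score, threshold, label):
    """Assumed reward: +1 if the thresholded decision matches the label, else -1."""
    predicted = 1 if score > threshold else 0
    return 1.0 if predicted == label else -1.0


def bucket(score, n_buckets=N_STATES):
    """Map an anomaly score in [0, 1] to a discrete state index."""
    return min(int(score * n_buckets), n_buckets - 1)


def train(scores, labels, episodes=200, alpha=0.1, gamma=0.9,
          epsilon=0.1, seed=0):
    """Tabular Q-learning over (score bucket, threshold index) pairs.

    This stands in for the DQN: the Q-table plays the role of the
    network, and the labels play the role of environmental feedback.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(THRESHOLDS) for _ in range(N_STATES)]
    for _ in range(episodes):
        for t in range(len(scores) - 1):
            s = bucket(scores[t])
            # Epsilon-greedy choice of threshold.
            if rng.random() < epsilon:
                a = rng.randrange(len(THRESHOLDS))
            else:
                a = max(range(len(THRESHOLDS)), key=lambda i: q[s][i])
            r = reward(scores[t], THRESHOLDS[a], labels[t])
            s_next = bucket(scores[t + 1])
            # Standard Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])
    return q


def choose_threshold(q, score):
    """Greedy threshold for the current anomaly score under the learned table."""
    s = bucket(score)
    best = max(range(len(THRESHOLDS)), key=lambda i: q[s][i])
    return THRESHOLDS[best]
```

In this sketch, `train` would be called on anomaly scores normalized to [0, 1] (for example, autoencoder reconstruction errors) together with whatever labels are available, and `choose_threshold` would then pick a threshold per time step. A DQN would replace the table with a neural network, which allows a richer, continuous state.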