Abstract

A Cyber-Physical System (CPS) is a system consisting of software components and physical components. Traditional verification techniques such as model checking and theorem proving are difficult to apply to CPS because the physical components have an infinite number of states. To address this problem, robustness-guided falsification of CPS has been introduced. Robustness quantifies how strongly a given specification is satisfied, and robustness-guided falsification attempts to minimize this robustness by varying the inputs and parameters of the system. An input with minimal robustness (a counterexample) is a good candidate for violating the specification. Existing methods apply various optimization techniques to minimize robustness. However, these methods do not exploit the temporal structure of system inputs and often require a large number of simulation runs to minimize the robustness. In this paper, we explore state-of-the-art Deep Reinforcement Learning (DRL) techniques, namely Asynchronous Advantage Actor-Critic (A3C) and Double Deep Q-Network (DDQN), to reduce the number of simulation runs required to find such counterexamples. We theoretically show how robustness-guided falsification of a safety property can be formulated as a reinforcement learning problem. We then experimentally compare the effectiveness of our methods with three baseline methods, i.e., random sampling, cross entropy, and simulated annealing, on three well-known CPS benchmarks. We thoroughly analyse the experimental results and identify two factors that make DRL-based methods outperform existing methods. The most important factor is the availability of the system's internal dynamics to the reinforcement learning algorithm; the other is the existence of learnable structure in the counterexample.
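To make the formulation concrete, the sketch below (our illustration, not the paper's code) shows how falsification of a simple safety property G(x < c) can be phrased as an episodic search: each input segment is an action, and the negative robustness acts as the reward to minimize. The ToyCPS plant, the threshold, and all parameter values are hypothetical placeholders; a real setup would wrap a Simulink or ODE simulator, and the random-sampling "agent" would be replaced by A3C or DDQN.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's implementation) of
# robustness-guided falsification cast as an episodic RL-style problem.

class ToyCPS:
    """Toy stand-in for a CPS simulator: an integrator x' = u, step dt."""

    def __init__(self, dt=0.1):
        self.dt = dt
        self.x = 0.0

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, u):
        self.x += self.dt * u
        return self.x


def robustness(x, threshold=2.0):
    # Safety property G(x < threshold): robustness is the margin to
    # violation, and it turns negative exactly when the property fails.
    return threshold - x


def falsify(episodes=200, horizon=50, seed=0):
    """Search for an input signal that drives the robustness below zero.

    The 'agent' here is plain random sampling over piecewise-constant
    inputs; in the paper's setting each chosen input segment is an
    action whose reward is the negative robustness.
    """
    rng = np.random.default_rng(seed)
    plant = ToyCPS()
    best_rob, best_trace = np.inf, None
    for _ in range(episodes):
        x = plant.reset()
        inputs, worst = [], np.inf
        for _ in range(horizon):
            u = rng.uniform(-3.0, 3.0)          # action: next input value
            x = plant.step(u)
            inputs.append(u)
            worst = min(worst, robustness(x))   # episode robustness
        if worst < best_rob:
            best_rob, best_trace = worst, inputs
        if best_rob < 0:                        # counterexample found
            break
    return best_rob, best_trace


if __name__ == "__main__":
    rob, trace = falsify()
    print("minimal robustness:", rob, "falsified:", rob < 0)
```

A DRL agent replaces the uniform sampling with a learned policy, which is where the two factors identified in the paper matter: exposing the system's internal state to the agent, and the presence of temporal structure in the counterexample that the policy can learn.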
