Abstract

A Cyber-Physical System (CPS) is a system consisting of software components and physical components. Traditional verification techniques such as model checking and theorem proving are difficult to apply to CPS because the physical components have an infinite number of states. To address this problem, robustness-guided falsification of CPS has been introduced. Robustness quantifies how strongly a given specification is satisfied, and robustness-guided falsification attempts to minimize this robustness by varying the inputs and parameters of the system. An input with minimal robustness (a counterexample) is a good candidate for violating the specification. Existing methods apply various optimization techniques to minimize robustness. However, these methods do not exploit the temporal structure of system inputs and often require a large number of simulation runs to minimize the robustness. In this paper, we explore state-of-the-art Deep Reinforcement Learning (DRL) techniques, namely Asynchronous Advantage Actor-Critic (A3C) and Double Deep Q-Network (DDQN), to reduce the number of simulation runs required to find such counterexamples. We theoretically show how robustness-guided falsification of a safety property can be formulated as a reinforcement learning problem. We then experimentally compare the effectiveness of our methods with three baseline methods, i.e., random sampling, cross entropy, and simulated annealing, on three well-known CPS benchmarks. We thoroughly analyse the experimental results and identify two factors that make DRL-based methods outperform existing methods. The most important factor is the availability of the system's internal dynamics to the reinforcement learning algorithm; the other is the existence of learnable structure in the counterexample.
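To make the formulation concrete, the sketch below (our illustration, not the paper's code) shows how falsification of a simple safety property G(x < c) can be phrased as an episodic search: each input segment is an action, and the negative robustness acts as the reward to minimize. The ToyCPS plant, the threshold, and all parameter values are hypothetical placeholders; a real setup would wrap a Simulink or ODE simulator, and the random-sampling "agent" would be replaced by A3C or DDQN.

```python
import numpy as np

# Minimal sketch (illustrative, not the paper's implementation) of
# robustness-guided falsification cast as an episodic RL-style problem.

class ToyCPS:
    """Toy stand-in for a CPS simulator: an integrator x' = u, step dt."""

    def __init__(self, dt=0.1):
        self.dt = dt
        self.x = 0.0

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, u):
        self.x += self.dt * u
        return self.x


def robustness(x, threshold=2.0):
    # Safety property G(x < threshold): robustness is the margin to
    # violation, and it turns negative exactly when the property fails.
    return threshold - x


def falsify(episodes=200, horizon=50, seed=0):
    """Search for an input signal that drives the robustness below zero.

    The 'agent' here is plain random sampling over piecewise-constant
    inputs; in the paper's setting each chosen input segment is an
    action whose reward is the negative robustness.
    """
    rng = np.random.default_rng(seed)
    plant = ToyCPS()
    best_rob, best_trace = np.inf, None
    for _ in range(episodes):
        x = plant.reset()
        inputs, worst = [], np.inf
        for _ in range(horizon):
            u = rng.uniform(-3.0, 3.0)          # action: next input value
            x = plant.step(u)
            inputs.append(u)
            worst = min(worst, robustness(x))   # episode robustness
        if worst < best_rob:
            best_rob, best_trace = worst, inputs
        if best_rob < 0:                        # counterexample found
            break
    return best_rob, best_trace


if __name__ == "__main__":
    rob, trace = falsify()
    print("minimal robustness:", rob, "falsified:", rob < 0)
```

A DRL agent replaces the uniform sampling with a learned policy, which is where the two factors identified in the paper matter: exposing the system's internal state to the agent, and the presence of temporal structure in the counterexample that the policy can learn.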
