Abstract

This paper investigates a safe learning problem in which the system must satisfy linear temporal logic (LTL) constraints under persistent adversarial inputs, with quantified performance and robustness. Via a finite-state automaton, the LTL specification is first decomposed into a sequence of two-point boundary value problems (TPBVPs), each of which has an invariant safety zone. We then employ a system transformation, built on logarithmic barrier and hyperbolic-type functions, that guarantees state and control safety against a worst-case adversarial input that attempts to push the system outside the safety set. A safe learning method is used to solve each sub-problem, in which the actors (approximators of the optimal control and the worst-case adversarial inputs) and the critic (approximator of the cost) are tuned to learn the optimal policies without violating safety. Finally, via a Lyapunov stability analysis we prove boundedness of the closed-loop system, and simulation results validate the effectiveness of the approach.
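To make the barrier-based transformation concrete, the sketch below shows one common form of logarithmic barrier that maps a constrained scalar state z in an invariant safety zone (a, A) to an unconstrained variable s, together with its hyperbolic-type inverse. The specific barrier form, the bounds a and A, and all names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

# Hypothetical bounds of the invariant safety zone for one state component.
# The barrier form below is a common choice in barrier-based state
# transformations; the paper's exact construction may differ.
a, A = -2.0, 3.0  # require a < 0 < A so that barrier(0) = 0

def barrier(z, a=a, A=A):
    """Logarithmic barrier: maps the constrained state z in (a, A) to an
    unconstrained variable s; s -> +/- infinity as z approaches A or a."""
    return np.log((A / a) * (a - z) / (A - z))

def barrier_inv(s, a=a, A=A):
    """Hyperbolic-type inverse: maps the unconstrained s back into (a, A),
    so any policy learned in s-coordinates respects the safety zone."""
    return a * A * (np.exp(s) - 1.0) / (a * np.exp(s) - A)

# Round-trip check on a few states inside the safety zone.
for z in [-1.5, 0.0, 2.5]:
    s = barrier(z)
    assert abs(barrier_inv(s) - z) < 1e-9
    print(f"z = {z:+.2f} -> s = {s:+.4f} -> z' = {barrier_inv(s):+.4f}")
```

In a scheme of this kind, the dynamics are rewritten in the transformed coordinate s, so that keeping s bounded during learning implies the original state never leaves (a, A); an analogous saturation-type transformation can be applied to the control input.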
