Abstract

This paper investigates a safe learning problem subject to linear temporal logic (LTL) constraints under persistent adversarial inputs, with quantified performance and robustness. Using a finite state automaton, the LTL specification is first decomposed into a sequence of two-point boundary value problems (TPBVPs), each of which has an invariant safety zone. We then employ a system transformation that guarantees state and control safety via logarithmic barrier and hyperbolic-type functions, and we consider a worst-case adversarial input that attempts to push the system outside the safety set. A safe learning method is used to solve each sub-problem, in which the actors (approximators of the optimal control and the worst-case adversarial inputs) and the critic (approximator of the cost) are tuned to learn the optimal policies without violating any safety constraint. Finally, through a Lyapunov stability analysis we prove boundedness of the closed-loop system, and simulation results validate the effectiveness of the approach.
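The abstract does not spell out the barrier-based transformation; a common instance from the safe-learning literature (an assumption here, so the paper's exact form may differ) maps a constrained state $z \in (a, A)$, with $a < 0 < A$, to an unconstrained coordinate $s$, while control saturation is handled by a hyperbolic tangent:

$$
s = B(z) = \log\!\left(\frac{A}{a}\,\frac{a - z}{A - z}\right),
\qquad
z = B^{-1}(s) = \frac{aA\left(e^{s} - 1\right)}{a\,e^{s} - A},
$$

so that $B(0) = 0$ and $s \to \pm\infty$ as $z$ approaches the boundary of the safety zone. A saturated control $u = \lambda \tanh(v/\lambda)$ similarly keeps $u \in (-\lambda, \lambda)$ for any unconstrained $v$. Learning then proceeds in the transformed $(s, v)$ coordinates, where any bounded trajectory corresponds to a safe trajectory of the original system.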
