Temporal logic guided safe model-based reinforcement learning: A hybrid systems approach

Max H Cohen,Zachary Serlin,Kevin Leahy,Calin Belta

doi:10.1016/j.nahs.2022.101295

Open Access

Journal: Nonlinear Analysis: Hybrid Systems
Publication Date: Oct 19, 2022
Citations: 8
License type: publisher-specific-oa

Affiliation: Boston University

Abstract

This paper studies the problem of synthesizing control policies for uncertain continuous-time nonlinear systems from linear temporal logic (LTL) specifications using model-based reinforcement learning (MBRL). Rather than taking an abstraction-based approach, we view the interaction between the LTL formula’s corresponding Büchi automaton and the nonlinear system as a hybrid automaton whose discrete dynamics match exactly those of the Büchi automaton. To find satisfying control policies, we pose a sequence of optimal control problems associated with states in the accepting run of the automaton and leverage control barrier functions (CBFs) to prevent specification violation. Since solving many optimal control problems for a nonlinear system is computationally intractable, we take a learning-based approach in which the value function of each problem is learned online in real time. Specifically, we propose a novel off-policy MBRL algorithm that simultaneously learns the uncertain dynamics of the system and the value function of each optimal control problem online while adhering to CBF-based safety constraints. Unlike related approaches, the MBRL method presented herein decouples convergence, stability, and safety, so that each aspect can be studied independently, which leads to stronger safety guarantees than those developed in related works. Numerical results are presented to validate the efficacy of the proposed method.
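
To make the role of a CBF-based safety constraint concrete, the following is a minimal sketch of a generic min-norm CBF safety filter. It is not the algorithm proposed in the paper. It assumes a known single-integrator system (x_dot = u), a hypothetical circular unsafe region, and the barrier function h(x) = |x - c|^2 - r^2; the function name `cbf_filter` and the gain `alpha` are illustrative choices. With a single affine constraint, the usual quadratic program has the closed-form solution used here.

```python
def cbf_filter(x, u_nom, center, radius, alpha=1.0):
    """Minimally modify u_nom so that grad_h(x) . u + alpha * h(x) >= 0.

    Assumes single-integrator dynamics x_dot = u and the hypothetical
    barrier h(x) = |x - center|^2 - radius^2 (safe when h >= 0).
    """
    # Barrier value and gradient for the assumed circular unsafe region.
    d = [xi - ci for xi, ci in zip(x, center)]
    h = sum(di * di for di in d) - radius ** 2
    grad = [2.0 * di for di in d]

    # How far the nominal input is from violating the CBF condition.
    slack = sum(g * u for g, u in zip(grad, u_nom)) + alpha * h
    norm_sq = sum(g * g for g in grad)

    # Nominal input already satisfies the condition, or the gradient
    # vanishes (only at the obstacle center, where no correction exists).
    if slack >= 0.0 or norm_sq == 0.0:
        return list(u_nom)

    # Closed-form min-norm correction: project onto the constraint boundary.
    scale = -slack / norm_sq
    return [u + scale * g for u, g in zip(u_nom, grad)]


if __name__ == "__main__":
    # Nominal input drives straight at the obstacle; the filter slows it down.
    print(cbf_filter([2.0, 0.0], [-10.0, 0.0], [0.0, 0.0], 1.0))
```

In this sketch the filter leaves the nominal input untouched whenever it already satisfies the barrier condition, which loosely mirrors the idea of keeping safety separate from the performance objective.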

Full Text