Learning Control Policies for Stochastic Systems with Reach-Avoid Guarantees

Đorđe Žikelić, Thomas A. Henzinger, Mathias Lechner, Krishnendu Chatterjee

doi:10.1609/aaai.v37i10.26407

Abstract

We study the problem of learning controllers for discrete-time non-linear stochastic dynamical systems with formal reach-avoid guarantees. This work presents the first method for providing formal reach-avoid guarantees, which combine and generalize stability and safety guarantees, with a tolerable probability threshold p in [0,1] over the infinite time horizon. Our method leverages advances in the machine learning literature and represents formal certificates as neural networks. In particular, we learn a certificate in the form of a reach-avoid supermartingale (RASM), a novel notion that we introduce in this work. Our RASMs provide reachability and avoidance guarantees by imposing constraints on what can be viewed as a stochastic extension of level sets of Lyapunov functions for deterministic systems. Our approach solves several important problems: it can be used to learn a control policy from scratch, to verify a reach-avoid specification for a fixed control policy, or to fine-tune a pre-trained policy if it does not satisfy the reach-avoid specification. We validate our approach on three stochastic non-linear reinforcement learning tasks.
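To make the notion of a reach-avoid specification with probability threshold p concrete, the following is a minimal sketch in Python. It is not the method of the paper: the paper provides formal guarantees through a learned RASM certificate, whereas this sketch only estimates the reach-avoid probability empirically by simulation over a finite horizon. The 1-D system, the gains, the noise bound, the target and unsafe sets, and all function names are hypothetical choices made for illustration.

```python
import random


def step(x, rng, gain, noise):
    """One step of a hypothetical closed-loop system: x' = gain * x + w,
    with w drawn uniformly from [-noise, noise]."""
    return gain * x + rng.uniform(-noise, noise)


def reach_avoid_episode(x0, rng, gain, noise=0.05,
                        target=0.2, unsafe=2.0, horizon=200):
    """Return True if the trajectory enters the target set |x| <= target
    before entering the unsafe set |x| >= unsafe (assumed sets)."""
    x = x0
    for _ in range(horizon):
        if abs(x) >= unsafe:
            return False
        if abs(x) <= target:
            return True
        x = step(x, rng, gain, noise)
    # Undecided within the finite horizon: counted as a failure here.
    return False


def estimate_reach_avoid(gain=0.5, episodes=2000, seed=0):
    """Empirical fraction of episodes satisfying the reach-avoid property,
    with initial states drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(episodes):
        x0 = rng.uniform(-1.0, 1.0)
        if reach_avoid_episode(x0, rng, gain):
            successes += 1
    return successes / episodes


if __name__ == "__main__":
    p = 0.9  # tolerable probability threshold (assumed value)
    for gain in (0.5, 1.5):
        estimate = estimate_reach_avoid(gain=gain)
        print(f"gain={gain}: estimate={estimate:.3f}, meets p={p}: {estimate >= p}")
```

A simulation estimate of this kind can suggest whether a policy is likely to meet the threshold, but it carries no formal guarantee and only covers a finite horizon, which is the gap that a certificate such as a RASM is meant to close.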

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 5

Similar Papers

Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent
Adrien Bolland ... Damien Ernst
Journal of Artificial Intelligence Research | VOL. 73
05 Jan 2022

Dissipativity, inverse optimal control, and stability margins for nonlinear discrete-time stochastic feedback regulators
Wassim M Haddad ... Manuel Lanchares
International Journal of Control | VOL. 96
01 Jun 2021

Statistical verification of dynamical systems using set oriented methods
Yu Wang ... Mahesh Viswanathan
14 Apr 2015

Lyapunov Theorems for Finite Time and Fixed Time Semistability of Discrete-Time Stochastic Systems
Junsoo Lee ... Wassim M Haddad
04 Mar 2023