Model-Free Reinforcement Learning for Optimal Control of Markov Decision Processes Under Signal Temporal Logic Specifications

Krishna C Kalagarla, Pierluigi Nuzzo, Rahul Jain

doi:10.1109/cdc45484.2021.9683444

Abstract

We present a model-free reinforcement learning (RL) algorithm to find an optimal policy for a finite-horizon Markov decision process (MDP) while guaranteeing a desired lower bound on the probability of satisfying a signal temporal logic (STL) specification. We propose a method to effectively augment the MDP state space to capture the required state history and express the STL objective as a reachability objective. The planning problem can then be formulated as a finite-horizon constrained Markov decision process (CMDP). For a general finite-horizon CMDP problem with unknown transition probability, we develop a reinforcement learning scheme that can leverage any model-free RL algorithm to provide an approximately optimal policy out of the general space of non-stationary randomized policies. We illustrate our approach in the context of robotic motion planning for complex missions under uncertainty and performance objectives.
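As an illustration of the state-augmentation idea, the following is a minimal sketch and not the construction used in the paper, which the abstract does not detail. It assumes a hypothetical specification "eventually visit `goal` and always avoid `pit`" over the whole horizon, and hypothetical state names (`start`, `hall`, `pit`, `goal`). Two history flags are appended to the original state, so that satisfying the specification becomes the question of whether the augmented state at the end of the horizon lies in a target set.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical labels for the illustrative specification:
# "eventually goal AND always not pit" over the full finite horizon.
GOAL = "goal"
PIT = "pit"


@dataclass(frozen=True)
class AugState:
    state: str      # original MDP state
    reached: bool   # True once GOAL has been visited (tracks "eventually")
    safe: bool      # True while PIT has never been visited (tracks "always")


def initial(state: str) -> AugState:
    """Augmented initial state; flags account for the first state itself."""
    return AugState(state, state == GOAL, state != PIT)


def augment_step(s: AugState, next_state: str) -> AugState:
    """Update the history flags after the original MDP moves to next_state."""
    return AugState(
        next_state,
        s.reached or next_state == GOAL,
        s.safe and next_state != PIT,
    )


def in_target(s: AugState) -> bool:
    """Reachability target: the flags certify that the trajectory satisfied the spec."""
    return s.reached and s.safe


def satisfies(trajectory: Sequence[str]) -> bool:
    """Fold a finite trajectory of original states through the augmented dynamics."""
    s = initial(trajectory[0])
    for nxt in trajectory[1:]:
        s = augment_step(s, nxt)
    return in_target(s)


if __name__ == "__main__":
    print(satisfies(["start", "hall", "goal", "hall"]))
    print(satisfies(["start", "pit", "goal"]))
```

In this simplified setting, the probability of satisfying the specification equals the probability that the augmented process ends the horizon in a state where `in_target` holds, which is the quantity a CMDP constraint would bound from below. STL formulas with nested, time-bounded operators generally need richer history (for example counters or sliding windows) than the two Boolean flags assumed here.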
