Abstract
We present a method to find a cost-optimal policy for a given finite-horizon Markov decision process (MDP) with unknown transition probabilities, such that the probability of satisfying a given signal temporal logic (STL) specification is above a desired threshold. We propose an augmentation of the MDP state space that enables the STL objective to be expressed as a reachability objective. In this augmented space, the optimal policy problem is reformulated as a finite-horizon constrained Markov decision process (CMDP). We then develop a model-free reinforcement learning (RL) scheme that provides an approximately optimal policy for any general finite-horizon CMDP problem. This scheme can make use of any off-the-shelf model-free RL algorithm and considers the general space of non-stationary randomized policies. Finally, we illustrate the applicability of our RL-based approach through two case studies.
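To make the augmentation idea concrete, here is a minimal sketch of how an STL objective of the form "eventually reach a goal set within the horizon" can be latched into an extra Boolean component of the state, so that satisfaction probability becomes a reachability probability on the augmented space. The MDP, goal set, horizon, and policy below are illustrative assumptions, not the paper's actual construction or benchmarks.

```python
# Hypothetical sketch: augment the MDP state with a Boolean "satisfied" flag
# so that "eventually reach GOAL within H steps" (a simple STL fragment)
# becomes a reachability objective on the augmented state (s, flag).
# The transition model P is hidden from the agent, matching the
# model-free setting; we only sample from it.
import random

# A small 4-state chain MDP; actions: 0 ("stay-ish"), 1 ("advance-ish").
P = {
    (0, 0): [(0, 0.9), (1, 0.1)], (0, 1): [(0, 0.2), (1, 0.8)],
    (1, 0): [(1, 0.9), (2, 0.1)], (1, 1): [(1, 0.2), (2, 0.8)],
    (2, 0): [(2, 0.9), (3, 0.1)], (2, 1): [(2, 0.2), (3, 0.8)],
    (3, 0): [(3, 1.0)],           (3, 1): [(3, 1.0)],
}
GOAL = {3}   # stands in for the region named by the STL formula
H = 6        # finite horizon

def step(s, a):
    """Sample a successor state; the agent never reads P directly."""
    r, acc = random.random(), 0.0
    for s2, p in P[(s, a)]:
        acc += p
        if r <= acc:
            return s2
    return P[(s, a)][-1][0]

def augmented_step(aug, a):
    """Augmented transition: the flag latches once GOAL is visited."""
    s, sat = aug
    s2 = step(s, a)
    return (s2, sat or (s2 in GOAL))

def estimate_satisfaction(policy, n_episodes=20000):
    """Monte-Carlo estimate of P(STL satisfied) = P(flag set at horizon)."""
    hits = 0
    for _ in range(n_episodes):
        aug = (0, 0 in GOAL)
        for t in range(H):
            aug = augmented_step(aug, policy(aug, t))
        hits += aug[1]
    return hits / n_episodes

# A trivial non-stationary-capable policy (here it ignores t).
always_advance = lambda aug, t: 1
print(estimate_satisfaction(always_advance))
```

In the paper's setting this satisfaction probability would enter the CMDP as a chance constraint, with a separate cost being minimized; the sketch only shows the augmentation and estimation step.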