Abstract

In this paper, we consider an uncertain Markov decision process (MDP) with a control cost and a linear temporal logic (LTL) control specification. We propose a reinforcement learning (RL) based method for designing an optimal control policy under which the controlled MDP satisfies the control specification with probability 1 and minimizes the expected discounted sum of the control costs. First, we construct a deterministic Rabin automaton (DRA) that accepts all and only the infinite words satisfying the LTL control specification. Second, we construct a product MDP of the MDP and the DRA to represent a dynamic control policy that satisfies the control specification. Third, we modify the product MDP so that RL can be applied to the design of an optimal control policy. The control action of the modified product MDP is a pair consisting of a pattern and an action, where the pattern is a set of actions. Moreover, we introduce a reward that represents both the satisfaction of the control specification and the minimal restrictiveness of the pattern. Finally, we propose an algorithm for designing an optimal control policy that makes a sequential decision in two steps. In the first step, we select a pattern that maximizes the discounted sum of the reward. In the second step, we select, from the pattern chosen in the first step, an action that minimizes the expected discounted sum of the costs. We also present an illustrative example showing that the proposed algorithm can obtain an optimal control policy.
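
The two-step decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that two value tables have already been learned, and the names `q_reward`, `q_cost`, `select_pattern`, `select_action`, and `two_step_decision` are hypothetical. The tabular form, the keying of the cost table by (state, action), and the default value of 0.0 for unseen entries are likewise assumptions made only for illustration.

```python
from typing import Dict, FrozenSet, Hashable, Iterable, Tuple

State = Hashable
Action = str
Pattern = FrozenSet[Action]  # a pattern is a set of actions


def select_pattern(
    q_reward: Dict[Tuple[State, Pattern], float],
    state: State,
    patterns: Iterable[Pattern],
) -> Pattern:
    """Step 1: pick the pattern with the largest estimated discounted reward."""
    # Unseen (state, pattern) pairs default to 0.0 (an assumption of this sketch).
    return max(patterns, key=lambda p: q_reward.get((state, p), 0.0))


def select_action(
    q_cost: Dict[Tuple[State, Action], float],
    state: State,
    pattern: Pattern,
) -> Action:
    """Step 2: within the chosen pattern, pick the action with the smallest
    estimated discounted cost."""
    # Sorting makes tie-breaking deterministic.
    return min(sorted(pattern), key=lambda a: q_cost.get((state, a), 0.0))


def two_step_decision(
    q_reward: Dict[Tuple[State, Pattern], float],
    q_cost: Dict[Tuple[State, Action], float],
    state: State,
    patterns: Iterable[Pattern],
) -> Tuple[Pattern, Action]:
    """Return the (pattern, action) pair chosen by the two steps in order."""
    pattern = select_pattern(q_reward, state, patterns)
    action = select_action(q_cost, state, pattern)
    return pattern, action
```

In this sketch the second step never considers an action outside the selected pattern, even if that action has a lower estimated cost, which mirrors the ordering in the abstract: the specification-related reward is handled first, and cost is minimized only among the actions the pattern allows.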
