Abstract
This communiqué first presents a novel multi-policy improvement method that generates a feasible policy at least as good as every policy in a given set of feasible policies for a finite constrained Markov decision process (CMDP). A random search algorithm for finding an optimal feasible policy of a given CMDP is then derived by suitably adapting the improvement method. The algorithm alleviates the major drawback of the existing value-iteration and policy-iteration type exact algorithms, which must solve an unconstrained MDP at each iteration. We establish that the sequence of feasible policies generated by the algorithm converges to an optimal feasible policy with probability one, with a probabilistic exponential rate of convergence.
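To make the setting concrete, below is a minimal sketch of pure random search over deterministic policies in a finite discounted CMDP. It illustrates only the baseline problem the abstract refers to (a feasibility check plus a value comparison), not the paper's multi-policy improvement step or its adapted algorithm; the array shapes, the constraint form `beta @ V_c <= budget`, and all function names are assumptions introduced for illustration.

```python
import numpy as np

def evaluate(P, g, policy, gamma):
    """Exact policy evaluation: solve (I - gamma * P_pi) v = g_pi.

    P: (A, S, S) transition kernel, g: (A, S) one-step payoff (reward or cost),
    policy: (S,) deterministic policy, gamma: discount factor in (0, 1).
    """
    S = P.shape[1]
    P_pi = P[policy, np.arange(S)]   # row s is P(. | s, policy[s])
    g_pi = g[policy, np.arange(S)]   # one-step payoff under the policy
    return np.linalg.solve(np.eye(S) - gamma * P_pi, g_pi)

def random_search(P, r, c, beta, budget, gamma, iters, seed=0):
    """Keep the best feasible policy found among uniformly sampled ones.

    Feasibility here means the expected discounted cost from the initial
    distribution beta stays within the budget: beta @ V_c(pi) <= budget.
    """
    rng = np.random.default_rng(seed)
    A, S = P.shape[0], P.shape[1]
    best_pi, best_val = None, -np.inf
    for _ in range(iters):
        pi = rng.integers(A, size=S)                  # random deterministic policy
        if beta @ evaluate(P, c, pi, gamma) > budget:
            continue                                  # infeasible: skip
        val = beta @ evaluate(P, r, pi, gamma)
        if val > best_val:
            best_pi, best_val = pi, val
    return best_pi, best_val

# Usage on a small synthetic CMDP (all numbers are arbitrary).
rng = np.random.default_rng(1)
A, S, gamma = 3, 5, 0.9
P = rng.random((A, S, S)); P /= P.sum(axis=-1, keepdims=True)
r, c = rng.random((A, S)), rng.random((A, S))
beta = np.full(S, 1.0 / S)                            # uniform initial distribution
pi, val = random_search(P, r, c, beta, budget=4.5, gamma=gamma, iters=2000)
print(pi, val)
```

Unlike this naive baseline, the method summarized in the abstract combines the sampled policies through a policy-improvement step, which is what yields convergence with probability one at a probabilistic exponential rate.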