A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Honghao Wei, Lei Ying, Xin Liu

doi:10.1609/aaai.v36i4.20302

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 5

Affiliation: University of Michigan–Ann Arbor, ShanghaiTech University

#Constrained Markov Decision Processes #Learning Horizon #Model-free Reinforcement Learning Algorithm #Reinforcement Learning #Reinforcement Learning Algorithm #Learning Algorithm #Model-free Reinforcement Learning #Sublinear Regret #Constrained Markov #Markov Decision Processes

Abstract

This paper presents a model-free reinforcement learning (RL) algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs). For a sufficiently large learning horizon K, the proposed algorithm achieves sublinear regret and zero constraint violation. The bounds depend on the number of states S, the number of actions A, and two constants that are independent of K.
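
For readers unfamiliar with the two performance measures named in the abstract, the block below sketches one common way they are formalized for average-reward CMDPs. It is an illustrative assumption, not necessarily the paper's own notation. The symbols r (reward), u (utility), \rho (constraint threshold), J^{*} (optimal constrained average reward), and (s_k, a_k) (the state-action pair at step k) do not appear in the abstract, and the paper's bounds may be stated in expectation or with high probability.

```latex
% Assumed setting: maximize the long-run average reward subject to the
% long-run average utility being at least \rho. J^{*} denotes the best
% average reward achievable by any policy that satisfies this constraint.
\begin{align}
  \mathrm{Regret}(K)    &= K\,J^{*} - \sum_{k=1}^{K} r(s_k, a_k), \\
  \mathrm{Violation}(K) &= \sum_{k=1}^{K} \bigl(\rho - u(s_k, a_k)\bigr).
\end{align}
% "Sublinear regret" then means Regret(K) = o(K), so the per-step gap to
% J^{*} vanishes as K grows; "zero constraint violation" means
% Violation(K) \le 0 once K is sufficiently large.
```

Under this reading, the abstract's claim is that both quantities are controlled simultaneously, with the regret bound growing polynomially in S and A but slower than linearly in K.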
