A Provably-Efficient Model-Free Algorithm for Infinite-Horizon Average-Reward Constrained Markov Decision Processes

Honghao Wei,Xin Liu,Lei Ying

doi:10.1609/aaai.v36i4.20302

Open Access

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 4

Affiliation: University of Michigan–Ann Arbor, ShanghaiTech University

#Constrained Markov Decision Processes #Learning Horizon

Abstract

This paper presents a model-free reinforcement learning (RL) algorithm for infinite-horizon average-reward Constrained Markov Decision Processes (CMDPs). For a sufficiently large learning horizon K, the proposed algorithm achieves sublinear regret and zero constraint violation. The bounds depend on the number of states S, the number of actions A, and two constants that are independent of the learning horizon K.
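
To make the two performance measures concrete, the sketch below computes them from a logged trajectory of length K. It is an illustration only, not the paper's algorithm. It assumes one common convention: regret is K times the optimal average reward minus the reward actually collected, and violation is K times a required average-utility threshold minus the utility actually collected. The function names, the `optimal_gain` and `threshold` parameters, and the sample numbers are hypothetical and do not come from the abstract.

```python
from typing import Sequence


def cumulative_regret(rewards: Sequence[float], optimal_gain: float) -> float:
    """Regret over K = len(rewards) steps.

    Assumed convention: K * J_star - sum of collected rewards, where
    optimal_gain (J_star) is the optimal long-run average reward of the CMDP.
    """
    return optimal_gain * len(rewards) - sum(rewards)


def cumulative_violation(utilities: Sequence[float], threshold: float) -> float:
    """Constraint violation over K = len(utilities) steps.

    Assumed convention: the constraint requires the long-run average utility
    to be at least `threshold`, so the violation is K * threshold minus the
    collected utility. A value <= 0 means no violation over the horizon.
    """
    return threshold * len(utilities) - sum(utilities)


if __name__ == "__main__":
    # Hypothetical logged trajectory of K = 4 steps.
    rewards = [1.0, 0.0, 1.0, 0.0]
    utilities = [1.0, 1.0, 0.0, 1.0]
    print(cumulative_regret(rewards, optimal_gain=0.75))
    print(cumulative_violation(utilities, threshold=0.5))
```

Under this convention, "sublinear regret" means the first quantity grows more slowly than K, and "zero constraint violation" means the second quantity is at most zero once K is large enough.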

Full Text