Recursive adaptive control of Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena, Onésimo Hernández-Lerma

doi:10.1007/bf01442397

https://doi.org/10.1007/bf01442397

Journal: Applied Mathematics & Optimization	Publication Date: Jan 1, 1991
Citations: 9

Affiliation: Universidad Autónoma Agraria Antonio Narro

#Adaptive Policies #Control Of Markov Decision Processes

Abstract

We are concerned with Markov decision processes with Borel state and action spaces; the transition law and the reward function depend on an unknown parameter. In this framework, we study the recursive adaptive nonstationary value iteration policy, which is proved to be optimal under the same conditions usually imposed to obtain the optimality of other well-known nonrecursive adaptive policies. The results are illustrated by showing the existence of optimal adaptive policies for a class of additive-noise systems with unknown noise distribution.
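As a rough illustration of the recursive idea in the abstract, the following Python sketch runs one undiscounted value-iteration step per time period, each step using the current parameter estimate, and then acts greedily. It is not the authors' construction. Everything concrete here is an assumption made for illustration: the finite state space `0..N` (the paper works with Borel spaces), the Bernoulli noise with unknown success probability, the directly observed noise, the `reward` function, and all function names.

```python
import random

N = 3               # hypothetical finite state space 0..N (the paper uses Borel spaces)
ACTIONS = (0, 1)    # hypothetical actions: add nothing / add one unit


def step_state(x, a, xi):
    """Additive-noise dynamics x' = x + a - xi, clipped to the state space."""
    return min(N, max(0, x + a - xi))


def reward(x, a):
    """Hypothetical one-step reward: stay near level 1, pay 0.3 per unit added."""
    return -abs(x + a - 1) - 0.3 * a


def nvi_update(v_prev, theta_hat):
    """One value-iteration step under the current estimate theta_hat.

    Returns the new value function and the greedy decision rule.
    The noise is assumed Bernoulli with P(xi = 1) = theta_hat.
    """
    v_new, policy = [], []
    for x in range(N + 1):
        best_a, best_q = None, float("-inf")
        for a in ACTIONS:
            expected_next = ((1 - theta_hat) * v_prev[step_state(x, a, 0)]
                             + theta_hat * v_prev[step_state(x, a, 1)])
            q = reward(x, a) + expected_next
            if q > best_q:
                best_a, best_q = a, q
        v_new.append(best_q)
        policy.append(best_a)
    return v_new, policy


def run(true_theta=0.3, horizon=2000, seed=0):
    """Simulate the adaptive policy; return (final estimate, average reward)."""
    rng = random.Random(seed)
    x, v = 0, [0.0] * (N + 1)
    theta_hat = 0.5          # arbitrary initial guess
    ones, total_reward = 0, 0.0
    for t in range(horizon):
        # Recursive step: a single update per period, reusing v from period t-1.
        v, policy = nvi_update(v, theta_hat)
        a = policy[x]
        total_reward += reward(x, a)
        xi = 1 if rng.random() < true_theta else 0   # noise assumed observable
        x = step_state(x, a, xi)
        ones += xi
        theta_hat = ones / (t + 1)                   # empirical-frequency estimate
    return theta_hat, total_reward / horizon


if __name__ == "__main__":
    est, avg = run()
    print(f"estimated theta: {est:.3f}, average reward: {avg:.3f}")
```

The point of the sketch is the contrast with a nonrecursive scheme, which would re-solve the whole average-reward problem for each new estimate; here each period costs a single sweep over states and actions. Because the recursion is undiscounted, the values in `v` grow roughly linearly with time, which is harmless over a short horizon like this one.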

Full Text