Abstract

We consider a class of discrete-time Markov control processes with Borel state and action spaces, and with i.i.d. disturbances in ℝ^d whose common distribution μ is unknown. Under mild semicontinuity and compactness conditions, and assuming that μ is absolutely continuous with respect to Lebesgue measure, we establish the existence of adaptive control policies that are (1) optimal for the average-reward criterion and (2) asymptotically optimal in the discounted case. Our results are obtained by exploiting well-known facts from the theory of density estimation. This approach allows us to avoid the restrictive conditions on the state space and/or the system's transition law imposed in recent works, and it points the way to further applications of nonparametric (density) estimation in adaptive control.
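To make the density-estimation step concrete, the following is a minimal sketch of the kind of nonparametric estimator the abstract alludes to: a kernel density estimate of the unknown disturbance density built from observed i.i.d. disturbances. The Gaussian kernel, the fixed bandwidth, and all names here are illustrative assumptions; the paper itself only relies on standard consistency results for such estimators, not on this particular choice.

```python
import numpy as np

def kernel_density_estimate(samples, x, bandwidth):
    """Gaussian-kernel density estimate of an unknown density on R^d.

    samples : (n, d) array of observed i.i.d. disturbances.
    x       : (m, d) array of points at which to evaluate the estimate.

    Illustrative sketch only; kernel and bandwidth are assumptions,
    not the authors' prescribed choices.
    """
    samples = np.atleast_2d(samples)
    x = np.atleast_2d(x)
    n, d = samples.shape
    # Pairwise scaled differences between query points and samples.
    diffs = (x[:, None, :] - samples[None, :, :]) / bandwidth
    # Standard multivariate Gaussian kernel at each difference.
    kernel_vals = np.exp(-0.5 * np.sum(diffs**2, axis=-1)) / (2 * np.pi) ** (d / 2)
    # Average over samples with the usual bandwidth normalization.
    return kernel_vals.sum(axis=1) / (n * bandwidth**d)

# Example: estimate a 1-d disturbance density from 500 observations.
rng = np.random.default_rng(0)
xi = rng.normal(size=(500, 1))           # observed disturbances
grid = np.linspace(-3.0, 3.0, 7)[:, None]  # evaluation points
print(kernel_density_estimate(xi, grid, bandwidth=0.3))
```

In an adaptive policy of the certainty-equivalence type, such an estimate of μ's density would be recomputed as disturbances accumulate and plugged into the optimality equations in place of the true density; as the estimate converges, the resulting policy inherits the optimality properties described above.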
