Abstract

We consider a class of discrete-time two-person zero-sum Markov games with Borel state and action spaces and possibly unbounded payoffs. The game evolves according to the recursive equation $x_{n+1}=F(x_{n},a_{n},b_{n},\xi_{n})$, $n=0,1,\dots$, where the disturbance process $\{\xi_{n}\}$ is formed by independent and identically distributed $\mathbb{R}^{k}$-valued random vectors, which are observable but whose common density $\rho$ is unknown to both players. Under certain continuity and compactness conditions, we combine a nonstationary iteration procedure and suitable density estimation methods to construct asymptotically discounted optimal strategies for both players.
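The following is a minimal sketch, in Python, of the setting the abstract describes: the state is driven forward by the recursive equation $x_{n+1}=F(x_n,a_n,b_n,\xi_n)$, the players observe the disturbances $\xi_n$, and a nonparametric estimate of the unknown density $\rho$ is built from those observations. The specific map `F`, the placeholder (non-optimal) strategies, the normal disturbance used for simulation, and the choice of a kernel estimator are all illustrative assumptions, not the paper's construction; the abstract only says "suitable density estimation methods".

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical system map F(x, a, b, xi); the paper leaves F abstract.
def F(x, a, b, xi):
    # Illustrative dynamics: linear in state and actions, perturbed by xi.
    return 0.5 * x + a - b + xi

rng = np.random.default_rng(0)

# The disturbance density rho is unknown to the players; a standard normal
# stands in here purely so the simulation can be run.
def draw_disturbance():
    return rng.standard_normal()

# Simulate n stages with placeholder stationary strategies, recording the
# observed disturbances xi_0, ..., xi_{n-1}.
n = 500
x = 0.0
observed_xi = []
for _ in range(n):
    a = np.clip(-x, -1.0, 1.0)   # player 1's action (illustrative only)
    b = np.clip(x, -1.0, 1.0)    # player 2's action (illustrative only)
    xi = draw_disturbance()
    observed_xi.append(xi)
    x = F(x, a, b, xi)

# Nonparametric estimate of rho from the observed disturbances; a kernel
# estimator is one possible instance of a "suitable density estimation method".
rho_hat = gaussian_kde(np.array(observed_xi))
print("estimated density at 0:", rho_hat(0.0)[0])
```

In the adaptive scheme the abstract alludes to, such an estimate would be recomputed as more disturbances are observed and fed into a nonstationary value-iteration step; the sketch above only illustrates the observation and estimation part.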
