Abstract
We study two-person zero-sum Markov games with Borel spaces under a long-run average criterion. The payoff function is possibly unbounded and depends on a parameter that is unknown to one of the players. The parameter and the payoff function can be estimated by statistical methods. Our main objective is therefore to combine such an estimation procedure with a variant of the so-called vanishing discount approach to construct an average optimal pair of strategies for the game. Our results are applied to a class of zero-sum semi-Markov games.