Abstract
We develop a method for learning the optimal strategies of a two-person zero-sum Markov game under the expected average reward criterion. At each stage, the players play a modified matrix game associated with each state and then receive information about the outcome of the game from a teacher. Using this information, the players generate a pair of mixed strategies for each state, which they use at the next stage. The pair of mixed strategies so generated converges with probability one and in mean square to a pair of optimal stationary strategies. Further, when the teacher stops the learning at some stage, the probability of error is estimated.
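The paper's specific learning scheme is not reproduced here, but the general idea it builds on, players repeatedly updating mixed strategies from payoff feedback in a zero-sum matrix game, can be sketched with a standard multiplicative-weights update. This is an illustrative stand-in, not the authors' algorithm; the game matrix `A`, step size `eta`, and horizon `T` below are assumptions chosen for the example.

```python
import numpy as np

def multiplicative_weights(A, T=5000, eta=0.05):
    """Illustrative learning dynamic for the zero-sum matrix game A.

    The row player maximizes x^T A y, the column player minimizes it.
    Each round, both players observe their expected payoff per pure
    action (the "feedback") and reweight their strategies accordingly.
    Returns the time-averaged mixed strategies, which converge to an
    equilibrium pair in zero-sum games.
    """
    m, n = A.shape
    wx, wy = np.ones(m), np.ones(n)          # action weights
    avg_x, avg_y = np.zeros(m), np.zeros(n)  # running strategy averages
    for _ in range(T):
        x, y = wx / wx.sum(), wy / wy.sum()  # current mixed strategies
        avg_x += x
        avg_y += y
        wx *= np.exp(eta * (A @ y))          # reward row player's good actions
        wy *= np.exp(-eta * (A.T @ x))       # penalize column player's losses
    return avg_x / T, avg_y / T

# Matching pennies: the unique equilibrium plays each action with prob. 1/2.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = multiplicative_weights(A)
```

In a Markov game, a learner of this kind would be run per state, with the matrix-game payoffs modified to account for continuation values, which is the role of the "modified matrix game" mentioned in the abstract.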