Abstract

Dynamic zero-sum games are a model of multiagent decision-making that has been well studied in the mathematical game theory literature. In this paper, we derive a sufficient condition for the existence of a solution to such a game, and then discuss reinforcement learning strategies for solving it under uncertainty, where the game matrices at the various states, as well as the transition probabilities between the states under the agents' actions, are unknown. A novel algorithm based on heterogeneous games of learning automata (HEGLA), along with algorithms based on model-based and model-free reinforcement learning, is presented as a possible approach to learning the Markov equilibrium policies when these are assumed to satisfy the sufficient condition for existence. In the HEGLA algorithm, each automaton simultaneously plays zero-sum games with some automata and identical-payoff games with others. Simulation studies are reported to complement the theoretical and algorithmic discussions.
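The basic building block behind such automata-based algorithms can be sketched as follows: two learning automata with linear reward-inaction (L_R-I) updates repeatedly play a fixed 2x2 zero-sum matrix game. The particular game matrix, normalization, learning rate, and Bernoulli reward scheme below are illustrative assumptions for a single-state game, not the paper's HEGLA construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 zero-sum game (row player's payoffs), chosen so that
# (row=1, col=1) is a pure-strategy saddle point; not taken from the paper.
A = np.array([[3.0, 1.0],
              [4.0, 2.0]])
A01 = (A - A.min()) / (A.max() - A.min())  # rescale payoffs into [0, 1]

lam = 0.01                 # L_R-I step-size parameter (illustrative)
p_row = np.full(2, 0.5)    # row automaton's action probabilities
p_col = np.full(2, 0.5)    # column automaton's action probabilities

def lri_update(p, action, rewarded, lam):
    """Linear reward-inaction: move toward the chosen action on a reward,
    leave the probability vector unchanged on a penalty."""
    if not rewarded:
        return p
    q = p * (1.0 - lam)    # shrink all probabilities
    q[action] += lam       # then boost the chosen action; q still sums to 1
    return q

for _ in range(20000):
    i = rng.choice(2, p=p_row)
    j = rng.choice(2, p=p_col)
    # The row player (maximizer) is rewarded with probability A01[i, j];
    # the column player (minimizer) with the complementary probability.
    p_row = lri_update(p_row, i, rng.random() < A01[i, j], lam)
    p_col = lri_update(p_col, j, rng.random() < 1.0 - A01[i, j], lam)

print(p_row, p_col)  # both vectors concentrate near the saddle-point actions
```

Because L_R-I only updates on rewards, the pure strategies are absorbing states; for games with a pure saddle point and a small enough step size, the joint play converges to that saddle point with high probability.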
