Abstract

In this paper, baseball is formulated as a finite Markov game with approximately 6.45 million states. We give an effective dynamic programming algorithm that computes Markov perfect equilibria and the value functions of the game for both teams in 2 seconds per game. The optimal decision can be determined for each situation: for example, whether the batting team maximizes its win percentage by batting for a hit, stealing a base, or sacrifice bunting, and whether the fielding team should pitch to a batter or intentionally walk him. In addition, our algorithm makes it possible to compute the optimal batting order while accounting for strategy optimization such as sacrifice bunts and stolen bases. The authors believe that this baseball model is also useful as a benchmark instance for evaluating the performance of (multi-agent) reinforcement learning methods.
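
To make the dynamic programming idea concrete, the following is a minimal sketch on a toy model, not the authors' algorithm. It assumes a single half-inning whose state is only the number of outs and the base occupancy (24 non-terminal states rather than millions), uses the probability of scoring at least one run as a stand-in for win percentage, and assumes the fielding team commits to pitching or walking before the batting team chooses between swinging and bunting. All probabilities and the names `solve`, `walk`, and `advance` are hypothetical.

```python
from functools import lru_cache

# Hypothetical outcome probabilities (illustrative only, not taken from the paper).
P_SWING = {"out": 0.68, "single": 0.29, "homer": 0.03}
P_BUNT_OK = 0.80  # assumed chance that a sacrifice bunt advances the runners

THIRD = 0b100  # base bitmask: bit 0 = first, bit 1 = second, bit 2 = third


def advance(bases):
    """Move every runner up one base; return (new_bases, scored)."""
    return (bases << 1) & 0b111, bool(bases & THIRD)


def walk(bases):
    """Put the batter on first, moving only forced runners; return (new_bases, scored)."""
    if bases & 0b001 == 0:
        return bases | 0b001, False
    if bases & 0b010 == 0:
        return bases | 0b010, False
    if bases & 0b100 == 0:
        return bases | 0b100, False
    return bases, True  # bases loaded: a run is forced in


@lru_cache(maxsize=None)
def solve(outs, bases):
    """Return (value, fielding_action, batting_action) for one state.

    value is the batting team's probability of scoring at least one run
    before the third out, with both teams playing optimally.
    """
    if outs == 3:
        return 0.0, None, None

    def v(o, b, scored=False):
        # Scoring ends the toy game with value 1 for the batting team.
        return 1.0 if scored else solve(o, b)[0]

    moved, scored = advance(bases)
    swing = (P_SWING["out"] * v(outs + 1, bases)
             + P_SWING["single"] * v(outs, moved | 0b001, scored)
             + P_SWING["homer"] * 1.0)
    # On a successful bunt the batter is out; a run cannot score on the third out.
    bunt = (P_BUNT_OK * v(outs + 1, moved, scored and outs < 2)
            + (1 - P_BUNT_OK) * v(outs + 1, bases))
    # The batting team maximizes its value if it is pitched to.
    bat_value, bat_action = (swing, "swing") if swing >= bunt else (bunt, "bunt")

    # The fielding team minimizes, walking only if that is strictly better for it.
    walked, forced_in = walk(bases)
    walk_value = v(outs, walked, forced_in)
    if walk_value < bat_value:
        return walk_value, "walk", None
    return bat_value, "pitch", bat_action


if __name__ == "__main__":
    for outs in range(3):
        for bases in range(8):
            value, field, bat = solve(outs, bases)
            print(f"outs={outs} bases={bases:03b} value={value:.3f} {field} {bat}")
```

Because every transition either adds an out or adds a runner, the memoized recursion visits each state once and terminates. The min-then-max ordering at each state is the assumed move order here; the paper's model may order or structure the decisions differently.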
