Abstract

The average reward criterion used in traditional Monte-Carlo tree search (MCTS) has long been a problem for sudden-death games, because averaging rewards reduces the probability of reaching a deterministic result. Non-sudden-death games such as Go are unaffected, since they can focus solely on achieving higher win rates rather than higher scores. In this work, we apply miniMax-Monte-Carlo tree search with depth rewards to Outer-Open Gomoku (a variant of Gomoku) to discover forced wins and losses without any human knowledge, evaluation function, or pre-training. This approach addresses not only the average reward problem but also the inaccurate win-rate estimates produced by deep playout simulations in sudden-death games. Finally, we propose a new integrated framework, BBQ (Big, Best, Quick win) MCTS, which improves the performance of traditional MCTS.
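
The abstract does not give the exact formulation, but the following minimal sketch illustrates the general idea of combining a depth-scaled reward with minimax-style backup instead of the usual averaging; the reward shape, the `Node` fields, and the negamax convention are all illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: depth-scaled terminal rewards backed up with a
# max (minimax/negamax) rule instead of a running average, so that a forced
# win found at a shallow depth dominates deeper or merely probable wins.

import math


class Node:
    def __init__(self, parent=None):
        self.parent = parent
        self.children = []        # child nodes, one per legal move (expansion omitted)
        self.visits = 0
        self.value = -math.inf    # best value seen so far, not a mean reward


def depth_reward(win: bool, depth: int, max_depth: int = 50) -> float:
    """Assumed reward shape: a win is worth more the shallower it is found,
    so quicker forced wins outrank wins that need deep playouts."""
    if not win:
        return 0.0
    return 1.0 - depth / (max_depth + 1)


def backup_minimax(leaf: Node, reward: float) -> None:
    """Back the depth-scaled reward up the path to the root.

    Assumes each node stores its value from the perspective of the player
    to move at that node (negamax convention): the sign flips at every ply,
    and the stored value is the maximum seen rather than an average.
    """
    node, value = leaf, reward
    while node is not None:
        node.visits += 1
        node.value = max(node.value, value)  # minimax-style, no averaging
        value = -value                       # switch player perspective
        node = node.parent
```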
