Improved learning efficiency of deep Monte-Carlo for complex imperfect-information card games

Qian Luo, Tien-Ping Tan

doi:10.1016/j.asoc.2024.111545

Abstract

Deep Reinforcement Learning (DRL) has achieved considerable success in games involving perfect and imperfect information, such as Go, Texas Hold’em, Stratego, and DouDiZhu. Nevertheless, training a state-of-the-art model for complex imperfect-information card games like DouDiZhu and Big2 remains resource- and time-intensive. To address this challenge, this paper introduces two innovative methods: the Opponent Model and Optimized Deep Monte-Carlo (ODMC). These methods are designed to improve the training efficiency of Deep Monte-Carlo (DMC) for imperfect-information card games. The Opponent Model predicts hidden information, enhancing the agent’s learning speed in DMC compared to the original training, which uses only observed information as input features. In ODMC, the Minimum Combination Search (MCS) is a heuristic search algorithm based on dynamic programming. It calculates the minimum combination of actions in the current state, and ODMC uses MCS to filter out suboptimal actions in each state. This reduces the action space considered by DMC, resulting in faster training that focuses on evaluating the most promising actions. The effectiveness of the proposed approach is evaluated on two complex card games with imperfect information: DouDiZhu and Big2. Ablation experiments are conducted to evaluate both the Opponent Model (D+OM and B+OM) and ODMC (D+ODMC and B+ODMC), along with their combined variants (D+OMODMC and B+OMODMC). Furthermore, D+OMODMC and B+OMODMC are compared with state-of-the-art DouDiZhu and Big2 artificial intelligence (AI) programs, respectively. The experimental results demonstrate that the proposed methods achieve performance comparable to the original DMC with only 25.5% of the training time on the same device. These findings are valuable for reducing both the equipment requirements and the training time in complex imperfect-information card games.
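To make the idea of filtering actions by a minimum-combination count concrete, the following is a minimal sketch and not the authors' MCS implementation. It assumes a hypothetical, heavily simplified move set (single, pair, triple, or a straight of five or more consecutive ranks), represents a hand as a tuple of per-rank card counts, and assumes that an action counts as "promising" when the hand it leaves behind needs the fewest further plays to empty. The names `legal_actions`, `play`, `min_combinations`, and `filter_actions` are illustrative only.

```python
from functools import lru_cache

# Assumption: a toy move set, far simpler than real DouDiZhu or Big2 rules.
MIN_STRAIGHT = 5


def legal_actions(hand):
    """Enumerate actions as tuples of per-rank card counts to remove."""
    n = len(hand)
    actions = []
    # Singles, pairs, and triples of one rank.
    for r in range(n):
        for k in (1, 2, 3):
            if hand[r] >= k:
                a = [0] * n
                a[r] = k
                actions.append(tuple(a))
    # Straights: one card from each of >= MIN_STRAIGHT consecutive ranks.
    for start in range(n):
        end = start
        while end < n and hand[end] >= 1:
            end += 1
            if end - start >= MIN_STRAIGHT:
                a = [0] * n
                for r in range(start, end):
                    a[r] = 1
                actions.append(tuple(a))
    return actions


def play(hand, action):
    """Return the hand that remains after removing the action's cards."""
    return tuple(h - a for h, a in zip(hand, action))


@lru_cache(maxsize=None)
def min_combinations(hand):
    """Fewest plays needed to empty the hand (memoized recursion)."""
    if sum(hand) == 0:
        return 0
    return 1 + min(min_combinations(play(hand, a)) for a in legal_actions(hand))


def filter_actions(hand):
    """Keep only actions whose remaining hand needs the fewest plays."""
    actions = legal_actions(hand)
    scores = [min_combinations(play(hand, a)) for a in actions]
    best = min(scores)
    return [a for a, s in zip(actions, scores) if s == best]


if __name__ == "__main__":
    hand = (1, 1, 1, 1, 1, 2)  # five consecutive singles plus a pair
    print(len(legal_actions(hand)), "legal actions")
    print(len(filter_actions(hand)), "kept after filtering")
```

In this toy setting the filter discards actions such as playing the lowest single, which would break the straight and leave a hand that takes more plays to finish. A learning agent would then evaluate only the retained actions, which is the kind of action-space reduction the abstract attributes to ODMC.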

Full Text