Kdb-D2CFR: Solving Multiplayer imperfect-information games with knowledge distillation-based DeepCFR

Huale Li, Zengyue Guo, Yang Liu, Xuan Wang, Shuhan Qi, Jiajia Zhang, Jing Xiao

doi:10.1016/j.knosys.2023.110567

Abstract

Counterfactual regret minimization (CFR) is a popular method for finding an approximate Nash equilibrium in imperfect-information games (IIGs). However, existing CFR-based methods are either designed only for two-player IIGs or require substantial expert knowledge. In this paper, to solve multiplayer IIGs without relying on much expert knowledge, we propose a practical knowledge distillation-based framework that transfers knowledge from a model trained on a two-player IIG to a model for the multiplayer setting. With this framework, both training efficiency and performance are improved. To eliminate the need for expert knowledge, we incorporate a deep learning-based CFR into the framework, so that the counterfactual values of CFR can be estimated end to end without any expert knowledge or abstraction. We further propose kdb-DeepCFR and kdb-D2CFR, based on DeepCFR and D2CFR respectively, which can effectively compute strategies for large-scale multiplayer games. Extensive experiments on poker games with 3 to 8 players suggest that our method outperforms the baselines in game performance.

Full Text