Strategic decision-making in imperfect-information games is an important problem in artificial intelligence. Counterfactual regret minimization (CFR), a family of iterative algorithms, has been the workhorse for solving such games since its inception. In recent years, a series of novel CFR variants have been proposed that significantly improve the convergence rate of vanilla CFR. However, most of these variants are hand-designed by researchers through trial and error, often driven by different motivations, a process that demands considerable effort and insight. This work proposes AutoCFR, a systematic framework that meta-learns novel CFR algorithms through evolution, easing the burden of manual algorithm design. We first design a search language that is rich enough to express a wide range of CFR variants. We then employ a scalable regularized evolution algorithm, equipped with a set of acceleration techniques, to efficiently search the combinatorial space of algorithms defined by this language. The learned CFR algorithm generalizes to imperfect-information games not seen during training and performs on par with or better than existing state-of-the-art CFR variants. Beyond this empirical performance, we theoretically show that the learned algorithm converges to an approximate Nash equilibrium. Extensive experiments across diverse imperfect-information games highlight the scalability, extensibility, and generalizability of AutoCFR, establishing it as a general-purpose framework for solving imperfect-information games.
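To make "regularized evolution" concrete, the sketch below shows the generic aging-evolution loop (Real et al., 2019) that this family of search methods follows: sample a tournament from the population, mutate its fittest member, and retire the oldest individual. The callables `random_program`, `mutate`, and `fitness`, and all parameter values, are illustrative placeholders under assumed interfaces, not AutoCFR's actual search language or evaluation pipeline.

```python
import random
from collections import deque


def regularized_evolution(random_program, mutate, fitness,
                          population_size=32, sample_size=8, cycles=500):
    """Aging-evolution loop over a space of candidate programs.

    random_program, mutate, and fitness are problem-specific stand-ins:
    sampling a program in some search language, perturbing it, and
    scoring it (e.g. by how fast it reduces exploitability on training
    games). They are assumptions for illustration, not AutoCFR's API.
    """
    # Seed the population and score each individual once.
    population = deque((fitness(p), p)
                       for p in (random_program() for _ in range(population_size)))
    best = max(population, key=lambda sp: sp[0])

    for _ in range(cycles):
        # Tournament selection: the fittest member of a random sample.
        sample = random.sample(list(population), sample_size)
        _, parent = max(sample, key=lambda sp: sp[0])
        # Mutate the parent and evaluate the child.
        child = mutate(parent)
        scored_child = (fitness(child), child)
        population.append(scored_child)
        population.popleft()  # aging: remove the oldest, not the worst
        best = max(best, scored_child, key=lambda sp: sp[0])

    return best


if __name__ == "__main__":
    # Toy stand-ins: evolve a 16-bit string toward all ones.
    N = 16

    def rand_bits():
        return [random.randint(0, 1) for _ in range(N)]

    def flip_one(bits):
        out = list(bits)
        out[random.randrange(N)] ^= 1
        return out

    score, program = regularized_evolution(rand_bits, flip_one, sum)
    print(score, program)
```

The deque gives the "regularized" aging behavior cheaply: every individual is eventually discarded regardless of fitness, which keeps the population diverse and the search robust to noisy fitness estimates.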