We study equilibrium approximation in extensive-form adversarial team games, in which two teams of rational players compete in a zero-sum interaction. The suitable solution concept in these settings is the Team-Maxmin Equilibrium with Correlation (TMECor), which naturally arises when the team members can coordinate on ex-ante correlated strategies. While computing such an equilibrium is APX-hard, recent techniques show that scaling beyond toy instances is possible. However, even compact representations of the team's strategy space, such as the one based on Directed Acyclic Graphs (DAGs), have exponential size, which prevents solving large instances. In this paper, we show that Monte Carlo sampling for regret minimization can provide an important advance in adversarial team games. In particular, we design a DAG Monte Carlo Counterfactual Regret Minimization algorithm that performs outcome sampling with O(d) time complexity per iteration, where d is the depth of the DAG, and with a convergence-rate bound of O(b^(kd)), where b is the branching factor and k is the maximum number of private states in each public state of the team. We empirically evaluate our algorithms on a standard testbed of games, measuring their performance in approximating equilibria. We investigate both settings in which only one team is composed of multiple players and settings in which both teams are. Our empirical results show that our algorithms provide a rough approximation of equilibrium strategies in all the game instances considered, trading lower precision in the computed equilibrium for lower resource usage.
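The abstract does not spell out the DAG-based team algorithm itself. As background only, the sketch below illustrates plain outcome-sampling Monte Carlo CFR on two-player Kuhn poker, following the standard formulation of Lanctot et al.; it is not the team/DAG variant proposed in the paper, and the game choice, the exploration constant EPS, and all function names are illustrative assumptions.

```python
# Minimal sketch of outcome-sampling MCCFR on two-player Kuhn poker.
# NOT the DAG-based team algorithm from the paper; it only illustrates
# the outcome-sampling regret-minimization idea the paper builds on.
import random
from collections import defaultdict

ACTIONS = ["p", "b"]          # p = pass/check/fold, b = bet/call
EPS = 0.6                     # exploration of the sampling policy (illustrative)
TERMINALS = {"pp", "bp", "bb", "pbp", "pbb"}

regret = defaultdict(lambda: [0.0, 0.0])        # cumulative regrets per infoset
strategy_sum = defaultdict(lambda: [0.0, 0.0])  # weighted strategy sums per infoset


def regret_matching(r):
    pos = [max(x, 0.0) for x in r]
    s = sum(pos)
    return [x / s for x in pos] if s > 0 else [0.5, 0.5]


def utility(history, cards, player):
    # Terminal payoff of `player` (zero-sum, both players ante 1 chip).
    winner = 0 if cards[0] > cards[1] else 1
    if history == "bp":        # player 1 folds to a bet
        u0 = 1.0
    elif history == "pbp":     # player 0 folds to a bet
        u0 = -1.0
    elif history == "pp":      # showdown with pot of 1 per player
        u0 = 1.0 if winner == 0 else -1.0
    else:                      # "bb" or "pbb": showdown after a call, pot of 2
        u0 = 2.0 if winner == 0 else -2.0
    return u0 if player == 0 else -u0


def os_mccfr(history, cards, player, pi_i, pi_o, q):
    """One outcome-sampled trajectory; updates regrets of `player`.

    pi_i: reach probability of `player`'s own actions up to `history`
    pi_o: reach probability of the opponent's actions up to `history`
    q:    probability with which the sampling policy reached `history`
    Returns (u(z)/q(z), tail), where tail is the probability of `player`'s
    actions from `history` to the sampled terminal z under the current strategy.
    """
    if history in TERMINALS:
        return utility(history, cards, player) / q, 1.0
    acting = len(history) % 2
    infoset = str(cards[acting]) + history
    sigma = regret_matching(regret[infoset])
    if acting == player:       # explore at the update player's nodes
        probs = [EPS / 2 + (1 - EPS) * sigma[a] for a in range(2)]
    else:                      # sample the opponent on-policy
        probs = sigma
    a = 0 if random.random() < probs[0] else 1
    u_over_q, tail = os_mccfr(
        history + ACTIONS[a], cards, player,
        pi_i * sigma[a] if acting == player else pi_i,
        pi_o if acting == player else pi_o * sigma[a],
        q * probs[a])
    if acting == player:
        w = u_over_q * pi_o                   # sampled counterfactual weight
        tail_here = sigma[a] * tail           # player's tail prob. from `history`
        for a2 in range(2):                   # sampled counterfactual regrets
            regret[infoset][a2] += w * ((tail if a2 == a else 0.0) - tail_here)
        for a2 in range(2):                   # stochastically weighted averaging
            strategy_sum[infoset][a2] += (pi_i / q) * sigma[a2]
        return u_over_q, tail_here
    return u_over_q, tail


for t in range(200_000):
    for player in (0, 1):                     # alternate the updated player
        cards = random.sample(range(3), 2)    # deal two of the three cards
        os_mccfr("", cards, player, 1.0, 1.0, 1.0)

for infoset in sorted(strategy_sum):
    s = sum(strategy_sum[infoset])
    avg = [x / s if s > 0 else 0.5 for x in strategy_sum[infoset]]
    print(infoset, [round(p, 3) for p in avg])
```

Since all chance events (the deal) occur at the root and are sampled on-policy, their contribution to the opponent reach and to the sampling probability cancels, which is why the recursion can start with pi_o = q = 1.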