Abstract

Finding a Nash equilibrium in imperfect-information games is a challenging problem that has received much attention. Neural Fictitious Self-Play (NFSP) is a popular model-free machine learning algorithm that has been used to compute approximate Nash equilibria in such games. However, the deep reinforcement learning method used to approximate the best response in NFSP assumes fully observable Markov states, whereas states in imperfect-information games are partially observable and non-Markovian, which leads to a poor approximation of the best response. As a result, NFSP needs more iterations to converge. In this study, we present a new reinforcement learning method, inspired by counterfactual regret minimization, that relaxes the Markov requirement by iteratively updating the policy according to the regret-matching process. Combining this reinforcement learning method with fictitious play, we further present a novel algorithm for finding approximate Nash equilibria in zero-sum imperfect-information games. Experimental results on three benchmark games show that the new algorithm finds approximate Nash equilibria effectively and converges much faster than the baseline.
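For context, the regret-matching process referred to above assigns probability to each action in proportion to its positive cumulative regret. The sketch below illustrates this standard update rule only; the function name and fallback behavior are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of a standard regret-matching policy update (assumption:
# this mirrors the generic rule, not necessarily the authors' variant).
import numpy as np

def regret_matching_policy(cumulative_regrets: np.ndarray) -> np.ndarray:
    """Return a policy proportional to positive cumulative regrets.

    If no action has positive regret, fall back to the uniform policy.
    """
    positive = np.maximum(cumulative_regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(positive, 1.0 / len(positive))

# Example: cumulative regrets [2, -1, 1] over three actions
# yield the policy [2/3, 0, 1/3].
print(regret_matching_policy(np.array([2.0, -1.0, 1.0])))
```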
