Abstract

In the development of artificial intelligence (AI), games have often served as benchmarks that drive remarkable breakthroughs in models and algorithms. No-limit Texas Hold’em (NLTH) is one of the most popular and challenging poker games. Despite numerous studies on this subject, several important problems remain open, such as opponent exploitation, i.e., adaptively and effectively exploiting specific opponent strategies; this is acknowledged as a vital issue in NLTH and in many real-world scenarios. Previous researchers used off-policy reinforcement learning (RL) to train agents that learn directly from historical strategy interactions, but these methods suffer from the challenge of sparse rewards. Other researchers instead adopted neuroevolution (NE) in place of RL for policy parameter updates, but these methods suffer from high sample complexity due to the large scale of NLTH. In this work, we propose NE_RL, a novel method combining NE with RL for opponent exploitation in NLTH. Our method is a hybrid framework that uses NE’s evolutionary computation with a long-term fitness metric to address the sparse reward feedback in NLTH, while retaining RL’s gradient-based updates for higher learning efficiency. Experimental results against multiple baseline opponents demonstrate the feasibility of our method and show significant improvement over previous methods. We hope this paper provides an effective new approach for opponent exploitation in NLTH and other large-scale imperfect information games.
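The hybrid framework described above can be pictured as an outer evolutionary loop over a population of policies, with gradient-based RL updates applied within each generation. The sketch below is a minimal illustration of that structure only, not the paper's implementation: the policy network, the fitness and RL-update stand-ins, and all hyperparameters (population size, elite fraction, mutation scale) are assumptions made for illustration.

```python
# Illustrative sketch of a hybrid NE + RL training loop (assumptions, not the
# paper's NE_RL implementation). Requires PyTorch.
import copy
import random

import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS = 64, 5  # placeholder state/action sizes


class PolicyNet(nn.Module):
    """Toy policy network standing in for an NLTH betting policy."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(),
                                 nn.Linear(128, N_ACTIONS))

    def forward(self, obs):
        return self.net(obs)


def evaluate_fitness(policy, opponent, n_hands=200):
    """Long-term fitness: average chips won per hand over many hands against a
    fixed opponent. A random value stands in for an actual NLTH simulator."""
    del policy, opponent, n_hands  # placeholder only
    return random.gauss(0.0, 1.0)


def rl_gradient_step(policy, optimizer):
    """Stand-in for an off-policy RL update on experience gathered during
    fitness evaluation; a dummy entropy surrogate keeps the step runnable."""
    logits = policy(torch.randn(32, OBS_DIM))
    log_probs = torch.log_softmax(logits, dim=-1)
    loss = (log_probs.exp() * log_probs).sum(dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()


def hybrid_ne_rl(opponent, pop_size=10, generations=50, elite_frac=0.3):
    population = [PolicyNet() for _ in range(pop_size)]
    optimizers = [torch.optim.Adam(p.parameters(), lr=1e-4) for p in population]

    for _ in range(generations):
        # 1. Evolutionary selection on long-term fitness (robust to sparse rewards).
        fitness = [evaluate_fitness(p, opponent) for p in population]
        ranked = sorted(range(pop_size), key=fitness.__getitem__, reverse=True)
        n_elite = max(1, int(elite_frac * pop_size))

        # 2. Gradient-based RL refinement of every policy (sample efficiency).
        for p, opt in zip(population, optimizers):
            rl_gradient_step(p, opt)

        # 3. Replace the weakest policies with mutated copies of the elites.
        for i in ranked[n_elite:]:
            child = copy.deepcopy(population[random.choice(ranked[:n_elite])])
            with torch.no_grad():
                for param in child.parameters():
                    param.add_(0.02 * torch.randn_like(param))
            population[i] = child
            optimizers[i] = torch.optim.Adam(child.parameters(), lr=1e-4)

    return population[ranked[0]]  # best policy from the final generation's selection
```

In this sketch, selection on long-term fitness addresses the sparse per-hand reward signal, while the per-policy gradient step supplies the sample efficiency attributed to RL.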

Highlights

  • Poker is often regarded as a representative problem for the branch of imperfect information games in game theory

  • We propose a novel method combining neuroevolution (NE) with reinforcement learning (RL) for opponent exploitation in No-limit Texas Hold’em (NLTH)

  • Equilibrium-based methods need a large amount of computing resources to obtain so-called equilibrium solutions, and these solutions do not take advantage of any exploitable weakness in opponents’ strategies, which corresponds to poor dynamic adaptiveness [14]


Summary

Introduction

Poker is often regarded as a representative problem for the branch of imperfect information games in game theory. Texas Hold’em poker poses additional challenges of imperfect information, dynamic decision-making, and deliberate deception, as well as multistage chip and risk management, which have so far prevented it from being solved perfectly by AI. One popular approach toward this goal is equilibrium-based solutions, which account for most state-of-the-art algorithms [2,5,6,13]. Alternatively, one’s goal can be viewed as learning to play and maximize one’s rewards against specific opponent groups through repeated strategic interactions (which is exactly the core of NLTH). In such a case, an equilibrium strategy is not necessarily optimal, and this is the problem that opponent exploitation mainly addresses. Opponent-exploitation methods of this kind rely heavily on the accuracy of opponent identification, which is as difficult as (if not more difficult than) solving the game itself and requires either sufficient domain knowledge or a large amount of labeled data.
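In a two-player zero-sum setting, this distinction can be stated precisely (the notation below is ours, added for illustration). An exploitative strategy best-responds to the specific opponent strategy $\hat{\sigma}_{-i}$:

$$\sigma_i^{*} = \arg\max_{\sigma_i} \, u_i(\sigma_i, \hat{\sigma}_{-i}),$$

where $u_i$ denotes player $i$'s expected payoff. A Nash-equilibrium strategy instead maximizes the worst-case payoff $\min_{\sigma_{-i}} u_i(\sigma_i, \sigma_{-i})$ and therefore forgoes the gain $u_i(\sigma_i^{*}, \hat{\sigma}_{-i}) - u_i(\sigma_i^{\mathrm{Nash}}, \hat{\sigma}_{-i})$ that opponent exploitation targets.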
