Abstract

We argue that existing regret-matching procedures for approximating Nash equilibrium update strategies in a “jumpy” way when the probabilities of future plays are set proportional to positive regret measures. We propose a geometrical regret matching that features “smooth” strategy updating. Our approach is simple, intuitive, and natural. The analytical and numerical results show that “smoothly” suppressing “unprofitable” pure strategies is sufficient for the game to evolve toward Nash equilibrium, suggesting that, in reality, the tendency toward equilibrium could be pervasive and irresistible. Technically, iterative regret matching gives rise to a sequence of adjusted mixed strategies whose approximation to the true equilibrium point we can examine. The sequence can be studied in a metric space and visualized as a clear path toward an equilibrium point. Our theory has limitations in optimizing the approximation accuracy.

Highlights

  • The players keep track of their regrets over past plays and make future plays with probabilities proportional to positive regret measures

  • Iterative regret matching can be seen as continual updating of a mixed strategy with regret information: the mixed strategy to be updated is a statistical summary of all past plays, and the updated mixed strategy determines the probabilities of plays in the immediate future

  • To approximate a Nash equilibrium of the noncooperative game, we propose and test a regret matching with a geometrical flavor


Summary

INTRODUCTION

In 2000, Hart and Mas-Colell proposed an iterative algorithm called regret matching to approximate a correlated equilibrium. The players keep track of their regrets over past plays and make future plays with probabilities proportional to positive regret measures. This algorithm is natural in that the players do not need to know their opponents’ payoff functions, as opposed to the non-adaptive variety, e.g., the celebrated Lemke–Howson algorithm, which takes the two players’ payoff matrices as input and pinpoints equilibrium points as output.
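The "probabilities proportional to positive regret" rule can be illustrated with a minimal Python sketch. It assumes the simplified unconditional form of regret matching (one cumulative regret per action) rather than the exact procedure of Hart and Mas-Colell, and it is not the geometrical regret matching proposed in this paper. The function names, the uniform fallback when no regret is positive, and the self-play loop are all illustrative assumptions.

```python
import random


def positive_regret_strategy(cum_regret):
    """Mixed strategy proportional to the positive parts of cumulative regrets."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    if total <= 0.0:
        # Assumption: play uniformly when no action has positive regret.
        return [1.0 / len(cum_regret)] * len(cum_regret)
    return [p / total for p in positive]


def self_play(payoff_row, payoff_col, rounds=10000, seed=0):
    """Two players run the rule above; returns their empirical action frequencies."""
    rng = random.Random(seed)
    n_row, n_col = len(payoff_row), len(payoff_row[0])
    regret_row, regret_col = [0.0] * n_row, [0.0] * n_col
    count_row, count_col = [0] * n_row, [0] * n_col
    for _ in range(rounds):
        # Each player samples an action from its current regret-based strategy.
        a = rng.choices(range(n_row), weights=positive_regret_strategy(regret_row))[0]
        b = rng.choices(range(n_col), weights=positive_regret_strategy(regret_col))[0]
        count_row[a] += 1
        count_col[b] += 1
        # Regret of action k: payoff k would have earned minus the payoff received.
        # Each player only uses its own payoff matrix, never the opponent's.
        for k in range(n_row):
            regret_row[k] += payoff_row[k][b] - payoff_row[a][b]
        for k in range(n_col):
            regret_col[k] += payoff_col[a][k] - payoff_col[a][b]
    return [c / rounds for c in count_row], [c / rounds for c in count_col]


# Usage: matching pennies, a zero-sum game whose unique equilibrium mixes 50/50.
pennies_row = [[1, -1], [-1, 1]]
pennies_col = [[-1, 1], [1, -1]]
row_freq, col_freq = self_play(pennies_row, pennies_col)
print(row_freq, col_freq)
```

Because the sampled action can switch abruptly whenever a cumulative regret changes sign, this sketch also shows the kind of "jumpy" updating that the abstract contrasts with a smoother rule.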

NON-COOPERATIVE GAME AND REGRET
REGRET MATCHING AND FIXED POINT ITERATION
Function ψi has a geometrical interpretation
Function ψi has a behavioral interpretation
GAME EXAMPLES AND VISUALIZATIONS
EQPT APPROXIMATION ACCURACY
CONCLUSIONS