Abstract

We study the limiting behavior of the mixed strategies that result from optimal no-regret learning in a repeated game setting where the stage game is any 2 × 2 competitive game. We consider optimal no-regret algorithms that are mean-based and monotonic in their argument. We show that for any such algorithm, the limiting mixed strategies of the players cannot converge almost surely to any Nash equilibrium. This negative result is also shown to hold under a broad relaxation of these assumptions, including popular variants of Follow-the-Regularized-Leader with optimism or adaptive step sizes. Finally, we provide partial evidence that the monotonicity and mean-based assumptions can be removed or relaxed. Our results identify the inherent stochasticity in players' realizations as a critical factor underlying this divergence, and demonstrate a crucial difference in outcomes between using the opponent's mixtures and realizations to make updates.

Funding: V. Muthukumar was supported by a Simons-Berkeley Research Fellowship, NSF awards IIS-2212182 and CCF-2239151, and generous gifts from Amazon, Adobe, and Google. S. Phade acknowledges the support of the NSF [Grants CNS-1527846 and CCF-1618145] and the NSF Science & Technology Center [Grant CCF-0939370 (Science of Information)]. A. Sahai acknowledges the support of the ML4Wireless center member companies and the NSF [Grants AST-144078 and ECCS-1343398].
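To make the mixtures-versus-realizations contrast concrete, below is a minimal, hypothetical simulation sketch (not code from the paper) using Hedge, a standard mean-based, monotone instance of Follow-the-Regularized-Leader, in self-play on matching pennies. The payoff matrix, step-size schedule, and horizon are illustrative assumptions; the two feedback modes differ only in whether each player updates on the opponent's sampled action or on the opponent's full mixture.

```python
import numpy as np

# Matching pennies payoffs for the row player (zero-sum, 2 x 2 competitive game):
# +1 on a match, -1 on a mismatch. The unique Nash equilibrium is (1/2, 1/2).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

def hedge_mixture(cum_payoffs, eta):
    """Mean-based, monotone update: softmax of cumulative payoffs."""
    z = eta * cum_payoffs
    z -= z.max()  # numerical stability
    w = np.exp(z)
    return w / w.sum()

def run(T=50_000, use_realizations=True, seed=0):
    rng = np.random.default_rng(seed)
    cum1 = np.zeros(2)  # row player's cumulative payoff per action
    cum2 = np.zeros(2)  # column player's cumulative payoff per action
    for t in range(1, T + 1):
        eta = np.sqrt(np.log(2) / t)  # illustrative anytime step size
        p = hedge_mixture(cum1, eta)
        q = hedge_mixture(cum2, eta)
        if use_realizations:
            # Realized-feedback setting: update on the opponent's sampled action.
            i = rng.choice(2, p=p)
            j = rng.choice(2, p=q)
            cum1 += A[:, j]    # row payoffs against the realized column action
            cum2 += -A[i, :]   # column payoffs (zero-sum) against the realized row action
        else:
            # Mixture-feedback setting: update on expected payoffs against q (resp. p).
            cum1 += A @ q
            cum2 += -(p @ A)
    return p, q

print("realizations:", run(use_realizations=True))
print("mixtures:    ", run(use_realizations=False))
```

Running both modes lets one observe the phenomenon the abstract describes: under realization-based updates the mixed strategies keep fluctuating and tend to drift toward the corners of the simplex rather than settle at (1/2, 1/2), consistent with the almost-sure non-convergence result, whereas the deterministic mixture-based dynamics remove the stochasticity the paper identifies as the source of divergence.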
