No-regret learning for repeated non-cooperative games with lossy bandits

Wenting Liu, Jinlong Lei, Peng Yi, Yiguang Hong

doi:10.1016/j.automatica.2023.111455

Abstract

This paper considers no-regret learning for repeated continuous-kernel games with lossy bandit feedback. Since it is difficult to give an explicit model of the utility functions in dynamic environments, the players’ actions can only be learned with bandit feedback. Moreover, due to unreliable communication channels or privacy protection, the bandit feedback may be lost or dropped at random. Therefore, we study an asynchronous online learning strategy in which the players adaptively adjust their next actions to minimize the long-term regret. The paper provides a novel no-regret learning algorithm, called Online Gradient Descent with lossy bandits (OGD-lb). We first give the regret analysis for concave games with differentiable and Lipschitz utilities. Then we show that the action profile converges to a Nash equilibrium with probability 1 when the game is also strictly monotone. We further provide the mean-squared convergence rate $O(N p_i^{-2} k^{-1/3})$ when the game is β-strongly monotone, where $N$ denotes the number of players and $p_i$ is the update probability. In addition, we extend the algorithm to the case where the loss probability of the bandit feedback is unknown, and prove its almost sure convergence to a Nash equilibrium for strictly monotone games. Finally, we take resource management in fog computing as an application example, and carry out numerical experiments to empirically demonstrate the algorithm's performance.
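
To make the setting in the abstract concrete, the following is a minimal Python sketch of projected online gradient ascent with bandit feedback that is randomly lost. It is not the paper's OGD-lb algorithm, whose exact update rule is not given in the abstract. Everything specific here is an assumption for illustration: the Cournot-style payoff `cournot_utilities`, the one-point gradient estimate for scalar actions, the `1/p_i` scaling of received updates, the step-size and query-radius schedules, and all parameter values.

```python
import numpy as np


def cournot_utilities(x, a=10.0, c=1.0):
    """Hypothetical payoffs u_i(x) = x_i * (a - sum_j x_j) - c * x_i."""
    return x * (a - x.sum()) - c * x


def ogd_lossy_bandit(n_players=3, p=(0.9, 0.7, 0.5), x_max=5.0,
                     n_rounds=20000, seed=0):
    """Sketch: each player updates only in rounds where its feedback arrives."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float)           # assumed known update probabilities
    x = np.full(n_players, x_max / 2.0)      # base actions in [0, x_max]
    for k in range(1, n_rounds + 1):
        eta = 1.0 / (k + 10)                 # decaying step size (assumed)
        delta = min(1.0, k ** (-1.0 / 3.0))  # decaying query radius (assumed)
        z = rng.choice([-1.0, 1.0], size=n_players)
        played = x + delta * z               # perturbed actions actually played
        u = cournot_utilities(played)        # bandit feedback: payoffs only
        grad_est = u * z / delta             # one-point estimate, scalar actions
        received = rng.random(n_players) < p # feedback is lost with prob. 1 - p_i
        # Assumed correction: scale received updates by 1 / p_i.
        step = np.where(received, eta * grad_est / p, 0.0)
        # Project onto a shrunk interval so the next query stays feasible.
        next_delta = min(1.0, (k + 1) ** (-1.0 / 3.0))
        x = np.clip(x + step, next_delta, x_max - next_delta)
    return x


if __name__ == "__main__":
    print(ogd_lossy_bandit())
```

A player whose feedback is dropped simply keeps its current action for that round, which is what makes the updates asynchronous across players. For this hypothetical payoff the Nash equilibrium is `x_i = (a - c) / (N + 1)`, which gives a reference point for judging the output of a run.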


Journal: Automatica
Publication Date: Dec 8, 2023
Citations: 1

Similar Papers

No-regret learning for repeated concave games with lossy bandits
Wenting Liu ... Jinlong Lei
14 Dec 2021

Optimal No-Regret Learning in Strongly Monotone Games with Bandit Feedback
Tianyi Lin ... Wenjia Ba
SSRN Electronic Journal
01 Jan 2020

Review on QoS Aware Resource Management in Fog Computing Environment
Hemant Kumar Apat ... Prasenjit Maiti
16 Dec 2020

Learning in Auctions: Regret is Hard, Envy is Easy
Constantinos Daskalakis ... Vasilis Syrgkanis
01 Oct 2016
