Multi-Player Multi-Armed Bandits with Finite Shareable Resources Arms: Learning Algorithms & Applications

Xuchuang Wang, Hong Xie, John C S Lui

doi:10.24963/ijcai.2022/491

Abstract

Multi-player multi-armed bandits (MMAB) study how decentralized players cooperatively play the same multi-armed bandit so as to maximize their total cumulative reward. Existing MMAB models mostly assume that when more than one player pulls the same arm, the players either collide and obtain zero reward, or do not collide and gain independent rewards. Both assumptions are usually too restrictive in practical scenarios. In this paper, we propose an MMAB with shareable resources as an extension of the collision and non-collision settings. Each shareable arm has finite shareable resources and a “per-load” reward random variable, both of which are unknown to players. The reward from a shareable arm is equal to the “per-load” reward multiplied by the minimum of the number of players pulling the arm and the arm’s maximal shareable resources. We consider two types of feedback: sharing demand information (SDI) and sharing demand awareness (SDA), each of which provides different signals of resource sharing. We design the DPE-SDI and SIC-SDA algorithms to address the shareable arm problem under these two types of feedback, respectively, and prove that both algorithms have logarithmic regret that is tight in the number of rounds. We conduct simulations to validate both algorithms’ performance and demonstrate their utility in wireless networking and edge computing.
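The reward rule stated in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the Bernoulli per-load reward, and the list-based representation of player choices are all assumptions made for illustration. Only the formula itself (per-load reward times the minimum of the number of players and the arm's capacity) comes from the abstract.

```python
from collections import Counter
import random


def shared_arm_reward(per_load_reward, num_players, capacity):
    """Reward of one shareable arm in one round.

    Implements the rule from the abstract:
    per-load reward * min(number of players on the arm, arm capacity).
    """
    if num_players < 0 or capacity < 0:
        raise ValueError("num_players and capacity must be non-negative")
    return per_load_reward * min(num_players, capacity)


def round_rewards(choices, capacities, means, rng):
    """Simulate one round for all players.

    choices[i] is the arm pulled by player i (hypothetical representation).
    capacities[k] is the shareable resource capacity of arm k.
    means[k] is the mean of arm k's per-load reward; a Bernoulli draw is
    assumed here purely for illustration.
    Returns a dict mapping each pulled arm to its total reward this round.
    """
    loads = Counter(choices)  # number of players on each pulled arm
    rewards = {}
    for arm, num_players in loads.items():
        per_load = 1.0 if rng.random() < means[arm] else 0.0
        rewards[arm] = shared_arm_reward(per_load, num_players, capacities[arm])
    return rewards


if __name__ == "__main__":
    rng = random.Random(0)
    # Three players share arm 0 (capacity 2); one player pulls arm 1 (capacity 1).
    print(round_rewards([0, 0, 0, 1], capacities=[2, 1], means=[0.9, 0.5], rng=rng))
```

In this sketch a third player on an arm with capacity 2 adds nothing to that arm's reward, which is why players would want to learn both the per-load means and the capacities before settling on an allocation.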


Similar Papers

Decentralized Stochastic Multi-Player Multi-Armed Walking Bandits
Guojun Xiong ... Jian Li
Proceedings of the AAAI Conference on Artificial Intelligence | VOL. 37
26 Jun 2023

A Practical Multiplayer Multi-armed bandit Algorithm for Smart City Communication System
Shubhjeet Kumar Tiwari ... Sudhanshu Soni
19 Mar 2021

Fair Resource Reusing for D2D Communication Based on Reinforcement Learning
Fang-Chang Kuo ... Hwang-Cheng Wang
01 Jan 2020

Enhanced Dynamic Spectrum Access in UAV Wireless Networks for Post-Disaster Area Surveillance System: A Multi-Player Multi-Armed Bandit Approach
Amr Amrallah ... Ehab Mahmoud Mohamed
Sensors | VOL. 21
25 Nov 2021
