Abstract

We consider a cognitive radio network in which M secondary users compete to access one of N available channels. Channel availabilities are assumed to evolve as i.i.d. Bernoulli random processes whose means are unknown to the secondary users. In addition, the number of secondary users M is unknown to each user. The objective is to design a distributed online learning and access policy that maximizes the total throughput of the secondary users. It has previously been shown that, when M is known, this problem can be modeled elegantly as a decentralized multi-armed bandit (DMAB) problem. We propose a truly decentralized online learning algorithm, based on the DMAB formulation, for the case of unknown M. We show that using distributed access policies with an incorrect value of M results in linear growth of regret, and that underestimating M incurs a more significant loss than overestimating it. For distributed online learning of M, we propose a dynamic thresholding method in which the thresholds are determined dynamically using virtual systems built upon the current estimates of the mean channel availabilities. Because the algorithm allows its estimate of M to err in either direction over time, it is capable of tracking changes in the population of secondary users.
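
The channel model described above can be illustrated with a minimal sketch. It is not the proposed algorithm: it only draws i.i.d. Bernoulli channel availabilities, estimates their means from samples, and computes the sum of the M largest means, which is assumed here to be the per-slot throughput benchmark (a common choice in DMAB formulations, not stated in the abstract). All function names and numerical values are hypothetical, and the sketch makes the simplification that every channel is observed in every slot, whereas an actual secondary user would typically sense only the channel it accesses.

```python
import random


def simulate_availability(means, horizon, seed=0):
    """Return a horizon x N table of 0/1 draws; entry [t][k] is 1 if channel k is free in slot t."""
    rng = random.Random(seed)
    return [[1 if rng.random() < p else 0 for p in means] for _ in range(horizon)]


def sample_means(observations):
    """Empirical availability of each channel, averaged over all simulated slots."""
    horizon = len(observations)
    num_channels = len(observations[0])
    return [sum(row[k] for row in observations) / horizon for k in range(num_channels)]


def top_m_benchmark(means, num_users):
    """Sum of the num_users largest means (assumed per-slot throughput benchmark)."""
    return sum(sorted(means, reverse=True)[:num_users])


if __name__ == "__main__":
    true_means = [0.9, 0.8, 0.7, 0.6, 0.5]  # hypothetical availabilities, N = 5
    observations = simulate_availability(true_means, horizon=20000)
    print("estimated means:", sample_means(observations))
    print("benchmark for M = 3:", top_m_benchmark(true_means, num_users=3))
```

Under this assumed benchmark, a user working from the wrong M targets the wrong number of top channels, which gives some intuition for why a misestimate of M matters to the access policy.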
