Abstract

We consider the channel access problem arising in opportunistic scheduling over fading channels, cognitive radio networks, and server scheduling. The multi-channel communication system consists of $N$ channels. Each channel evolves as a time-nonhomogeneous multi-state Markov process. At each time instant, a user chooses $M$ channels to transmit information, and obtains some reward, i.e., throughput, based on the states of the chosen channels. The objective is to design an access policy, i.e., which channels should be accessed at each time instant, such that the expected accumulated discounted reward is maximised over a finite or infinite horizon. The considered problem can be cast as a restless multi-armed bandit (RMAB) problem, which is PSPACE-hard; the optimal policy is usually intractable due to its exponential computational complexity. Hence, a natural alternative is the easily implementable myopic policy, which maximises only the immediate reward and ignores the impact of the current strategy on future reward. In this paper, we perform an analytical study of the performance of the myopic policy for the considered RMAB problem, and establish a set of closed-form conditions that guarantee the optimality of the myopic policy.
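To make the myopic policy concrete, here is a minimal sketch in Python. It assumes, for simplicity, time-homogeneous dynamics (the paper treats the time-nonhomogeneous case), a belief vector per channel over $K$ states, and a hypothetical transition matrix and reward vector chosen purely for illustration; the policy simply picks the $M$ channels with the highest expected immediate reward.

```python
import numpy as np

def myopic_select(beliefs, rewards, M):
    """Myopic policy: pick the M channels with the highest expected
    immediate reward.

    beliefs: (N, K) array; beliefs[i] is channel i's distribution over K states.
    rewards: (K,) array; reward (throughput) when a chosen channel is in each state.
    """
    expected = beliefs @ rewards            # expected immediate reward per channel
    return np.argsort(expected)[::-1][:M]   # indices of the top-M channels

def propagate(beliefs, P):
    """One-step belief update for unobserved channels: b <- b P
    (stationary P assumed here; the paper allows P to vary with time)."""
    return beliefs @ P

# Tiny illustrative example: N = 3 two-state channels (0 = bad, 1 = good), M = 1.
P = np.array([[0.7, 0.3],
              [0.2, 0.8]])                  # hypothetical transition matrix
rewards = np.array([0.0, 1.0])              # unit throughput in the good state
beliefs = np.array([[0.9, 0.1],
                    [0.5, 0.5],
                    [0.1, 0.9]])
chosen = myopic_select(beliefs, rewards, M=1)  # channel 2 has the best outlook
beliefs = propagate(beliefs, P)                # advance all beliefs one step
```

Note that the selection ignores how today's choice shapes tomorrow's beliefs, which is exactly why myopic optimality requires the closed-form conditions established in the paper.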
