Abstract

We consider the access problem in a multichannel opportunistic communication system with imperfect sensing, where the state of each channel evolves as an independent, nonidentically distributed Markov process. This problem can be cast as a restless multiarmed bandit (RMAB) problem, which is intractable in general due to its exponential computational complexity. A promising approach that has attracted much research attention is a simple myopic policy that maximizes the immediate reward while ignoring the impact of the current action on future rewards. Specifically, we formalize a family of generic reward functions, referred to as $g$-regular functions and characterized by three axioms, and then establish a set of closed-form conditions for the optimality of the myopic policy and discuss the engineering implications of the obtained results.
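To make the setting concrete, the following is a minimal sketch of a myopic policy over independent two-state (Gilbert-Elliott) Markov channels with imperfect sensing. All parameter names and values (transition probabilities, detection and false-alarm probabilities, the identity reward) are illustrative assumptions, not taken from the paper:

```python
# Hedged sketch: myopic channel selection for N independent, nonidentical
# two-state Markov channels under imperfect sensing.
# Belief b_i = P(channel i is in the "good" state).

def predict(belief, p11, p01):
    """One-step Markov prediction of P(good).
    p11 = P(good -> good), p01 = P(bad -> good)."""
    return belief * p11 + (1.0 - belief) * p01

def myopic_choice(beliefs):
    """Myopic rule: sense the channel maximizing the immediate
    expected reward (here reward = P(good), i.e. g is the identity)."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])

def bayes_update(belief, observed_good, pd, pf):
    """Posterior P(good) after an imperfect sensing outcome.
    pd: detection probability, pf: false-alarm probability (assumed)."""
    if observed_good:
        num = belief * pd
        den = belief * pd + (1.0 - belief) * pf
    else:
        num = belief * (1.0 - pd)
        den = belief * (1.0 - pd) + (1.0 - belief) * (1.0 - pf)
    return num / den

# Example: three nonidentical channels (assumed parameters).
p11 = [0.8, 0.7, 0.9]
p01 = [0.2, 0.4, 0.1]
beliefs = [0.5, 0.5, 0.5]

for t in range(3):
    beliefs = [predict(b, a, c) for b, a, c in zip(beliefs, p11, p01)]
    i = myopic_choice(beliefs)
    # Suppose the sensor reports "good" with pd=0.9, pf=0.1 (assumed).
    beliefs[i] = bayes_update(beliefs[i], observed_good=True, pd=0.9, pf=0.1)
```

The myopic rule is greedy in the current belief vector; the paper's contribution is identifying conditions (via the $g$-regular axioms) under which this greedy rule is in fact optimal for the long-run reward.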
