Abstract

We consider a communication system where multiple unknown channels are available for transmission. Each channel is a channel with state which evolves in a Markov manner. The transmitter has to select L channels to use and also decide the resources (e.g., power, rate, etc.) to use for each of the selected channels. It observes the state of the channels it uses and receives no feedback on the state of the other channels. We model this problem as a partially observable Markov decision process and obtain a simplified belief state. We show that the optimal resource allocation policy can be identified in closed form. Once the optimal resource allocation policy is fixed, choosing the channel scheduling policy may be viewed as a restless bandit. We present an efficient algorithm to check indexability and compute the Whittle index for each channel. When the model is indexable, the Whittle index policy, which transmits over the L channels with the smallest Whittle indices, is an attractive heuristic policy.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call