Abstract

Consider a slot-based opportunistic communication system consisting of one transmitter, one receiver, and N two-state Markov channels. On the large time scale, the transmitter probes one of the N channels once in every block of K consecutive time slots; on the small time scale, it chooses one channel to access in each of those K slots. Each successful access yields one unit of reward. To maximize the cumulative reward over a time horizon of T, the joint probing (large time scale) and accessing (small time scale) problem is cast as a mixed-scale partially observable Markov decision process, which is proved to be PSPACE-hard. The mixed-scale sequential decision-making problem is then simplified into a probing decision problem on the large time scale. Given the high computational complexity of this probing decision problem, we present a simple heuristic policy that probes either the best or the second-best channel, ranked by availability probability, depending on the probing conditions, namely the missed-detection rate and the false-alarm rate. We then derive several sets of sufficient conditions, for different scenarios, under which the proposed heuristic policy is optimal. Finally, numerical experiments verify the theoretical analysis.
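
As a rough illustration of the quantities this abstract refers to, the following Python sketch tracks the availability probability (belief) of each two-state Markov channel and picks a channel to probe. It is a minimal sketch under stated assumptions, not the paper's algorithm: the function names, the transition parameters `p11` and `p01`, and the convention that `false_alarm` is P(observe bad | channel good) and `missed_detection` is P(observe good | channel bad) are all hypothetical choices made here. The abstract does not state when the second-best channel should be probed, so that decision is left to a caller-supplied flag.

```python
def propagate(belief, p11, p01, steps=1):
    """Advance the probability that a channel is good by `steps` slots.

    p11: assumed P(good -> good), p01: assumed P(bad -> good).
    """
    for _ in range(steps):
        belief = belief * p11 + (1.0 - belief) * p01
    return belief


def bayes_update(belief, observed_good, false_alarm, missed_detection):
    """Bayes update of the belief after one imperfect probe.

    Assumed convention (not taken from the source):
      false_alarm      = P(observe bad  | channel good)
      missed_detection = P(observe good | channel bad)
    """
    if observed_good:
        like_good, like_bad = 1.0 - false_alarm, missed_detection
    else:
        like_good, like_bad = false_alarm, 1.0 - missed_detection
    numerator = like_good * belief
    denominator = numerator + like_bad * (1.0 - belief)
    # If the observation has zero probability, keep the prior belief.
    return numerator / denominator if denominator > 0 else belief


def choose_probe(beliefs, probe_second_best=False):
    """Return the index of the best (or second-best) channel by belief.

    The rule for setting `probe_second_best` is a placeholder; the
    source only says it depends on the probing error rates.
    """
    order = sorted(range(len(beliefs)), key=lambda i: beliefs[i], reverse=True)
    if probe_second_best and len(order) > 1:
        return order[1]
    return order[0]
```

For example, with beliefs `[0.3, 0.7, 0.5]`, `choose_probe` selects channel 1 by default and channel 2 when `probe_second_best=True`; after probing, `bayes_update` refines the probed channel's belief and `propagate` carries all beliefs forward over the next K slots.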
