Abstract

We consider a collection of statistically identical two-state continuous-time Markov chains (channels). A controller continuously selects a channel with a view to maximizing the infinite-horizon average reward. A switching cost is paid on each channel change. We consider two cases: full observation (all channels observed simultaneously) and partial observation (only the current channel observed). We analyze the difference in performance between these cases for various policies. For the partial observation case with two channels or an infinite number of channels, we explicitly characterize an optimal threshold for each of two sensible policies, which we name “call-gapping” and “cool-off.” Our results present a qualitative view of the interaction between the number of channels, the available information, and the switching costs.

