Abstract

We consider learning-based variants of the cμ rule for scheduling in single and parallel server settings of multiclass queueing systems.In the single server setting, the cμ rule is known to minimize the expected holding-cost (weighted queue-lengths summed over classes and a fixed time horizon). We focus on the problem where the service rates μ are unknown with the holding-cost regret (regret against the cμ rule with known μ) as our objective. We show that the greedy algorithm that uses empirically learned service rates results in a constant holding-cost regret (the regret is independent of the time horizon). This free exploration can be explained in the single server setting by the fact that any work-conserving policy obtains the same number of samples in a busy cycle.In the parallel server setting, we show that the cμ rule may result in unstable queues, even for arrival rates within the capacity region. We then present sufficient conditions for geometric ergodicity under the cμ rule. Using these results, we propose an almost greedy algorithm that explores only when the number of samples falls below a threshold. We show that this algorithm delivers constant holding-cost regret because a free exploration condition is eventually satisfied.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call