Abstract

We study a dynamic optimal routing problem in which a controller must decide to which of two queues each arriving customer (representing a packet, message, call, etc.) is sent. The service rate in each queue may depend on the state of the system, may change over time, and is unknown to the controller. The controller's goal is to design a strategy that guarantees the best performance under worst-case service conditions. The cost is composed of a holding cost, an admission cost, and a cost that depends on the quality of service. We consider both the finite-horizon and the infinite-horizon discounted cost. The problem is studied in the framework of zero-sum Markov games, where the server, called player 1, is assumed to play against the router, called player 2. Each player is assumed to know all previous actions of both players as well as the current and past states of the system. We show that pure optimal strategies exist for both players. A value iteration algorithm is used to establish properties of the value of the game related to supermodularity and convexity. These properties are then shown to imply the existence of optimal strategies described by monotone switching curves for both players.
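The value iteration scheme mentioned in the abstract can be sketched for a small instance. The following is a minimal illustration only, not the paper's model: the state space, cost parameters, service-rate sets, and queue truncation below are all assumptions made for the sketch. It relies on the abstract's result that pure optimal strategies exist, so each stage is solved by a pure max-min: the server (player 1, maximizing cost) picks a service condition, and the router (player 2, minimizing cost) picks the queue an arrival joins.

```python
# Hypothetical finite-horizon value iteration for a two-queue routing game.
# All numeric parameters below are illustrative assumptions, not taken
# from the paper.

CAP = 5          # queue-length truncation (assumption for the sketch)
BETA = 0.9       # discount factor
H = 30           # horizon (number of value-iteration steps)
HOLD = 1.0       # per-customer holding cost (assumed)
ADMIT = 0.5      # admission cost (assumed)

# Server's action set: possible pairs of service rates (assumed values).
SERVICE = [(0.6, 0.2), (0.2, 0.6)]   # (rate at queue 1, rate at queue 2)

def step_cost(x1, x2):
    """One-stage cost: holding cost on both queues plus an admission cost."""
    return HOLD * (x1 + x2) + ADMIT

def next_states(x1, x2, route, rates):
    """Distribution over next states: the arrival joins queue `route`,
    then each queue independently completes a service with its rate."""
    if route == 0:
        x1 = min(x1 + 1, CAP)
    else:
        x2 = min(x2 + 1, CAP)
    r1, r2 = rates
    dist = []
    for d1, p1 in ((1, r1), (0, 1 - r1)):
        for d2, p2 in ((1, r2), (0, 1 - r2)):
            dist.append(((max(x1 - d1, 0), max(x2 - d2, 0)), p1 * p2))
    return dist

def value_iteration():
    # V maps each state (queue-length pair) to its value; start from 0.
    V = {(x1, x2): 0.0 for x1 in range(CAP + 1) for x2 in range(CAP + 1)}
    for _ in range(H):
        Vn = {}
        for (x1, x2) in V:
            # Pure max-min: the server picks the worst service condition,
            # the router responds by sending the arrival to the better queue.
            Vn[(x1, x2)] = max(
                min(
                    step_cost(x1, x2)
                    + BETA * sum(p * V[s]
                                 for s, p in next_states(x1, x2, route, rates))
                    for route in (0, 1)
                )
                for rates in SERVICE
            )
        V = Vn
    return V

V = value_iteration()
```

In this symmetric toy instance the computed value is symmetric in the two queue lengths and monotone in the workload, which is consistent with the convexity and monotone-switching-curve structure the abstract describes; the router's optimal pure action at each state is the minimizing `route` in the inner loop.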
