This paper considers a parallel queueing system with multiple stations, each of which has a dedicated queue and many statistically identical servers. Upon each customer arrival, the system manager must decide to which station the customer should be routed, with the objective of minimizing the system’s long-run average delay cost. A distinguishing feature of this paper is that a customer’s delay cost depends not only on the customer’s delay but also on the station to which the customer is routed. To account for this heterogeneity across stations, we propose a routing policy that can be regarded as an extension of the MED–FSF policy. Under this policy, an arriving customer is routed as follows: (i) if the servers in all stations are fully occupied, to the station with the minimum value of a quantity that depends on the station’s expected delay and the station index; (ii) otherwise, to the station with a fastest idle server. Using asymptotic analysis, we derive the diffusion limits of the queue-length processes and their stationary distributions under the proposed policy in the Halfin–Whitt regime. Combining these results with an asymptotic lower bound on the long-run average delay cost, we show that the proposed routing policy is asymptotically optimal under the considered objective. Finally, we provide numerical experiments to validate the accuracy of our diffusion approximation, and we compare the performance metrics under the proposed policy with those under other commonly used routing policies.
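As an illustration of the two-case routing rule described above, the following is a minimal Python sketch, not the paper's implementation. The abstract does not specify the quantity minimized in case (i), so the sketch assumes a hypothetical station-specific cost weight multiplied by a rough expected-delay estimate, (queue length + 1) / (servers × service rate). The names `Station`, `cost_weight`, `expected_delay`, and `route` are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Station:
    servers: int        # number of statistically identical servers
    rate: float         # service rate of each server at this station
    busy: int           # servers currently serving a customer
    queue: int          # customers waiting in the dedicated queue
    cost_weight: float  # ASSUMPTION: hypothetical station-specific delay cost weight


def expected_delay(s: Station) -> float:
    # ASSUMPTION: rough estimate of the wait seen by a new arrival when all
    # servers are busy: queue + 1 departures at total rate servers * rate.
    return (s.queue + 1) / (s.servers * s.rate)


def route(stations: list[Station]) -> int:
    """Return the index of the station that receives the arriving customer."""
    idle = [j for j, s in enumerate(stations) if s.busy < s.servers]
    if idle:
        # Case (ii): some server is idle, so pick a station with a fastest idle server.
        return max(idle, key=lambda j: stations[j].rate)
    # Case (i): every server is busy, so pick the station minimizing the
    # (assumed) cost-weighted expected delay.
    return min(
        range(len(stations)),
        key=lambda j: stations[j].cost_weight * expected_delay(stations[j]),
    )
```

For example, with two fully occupied stations the faster one may still lose to the slower one if its assumed cost weight is large enough, which is the kind of station-dependent trade-off the policy is designed to capture.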