Abstract
We consider the model of a token-based joint autoscaling and load-balancing strategy, proposed in a recent paper by Mukherjee et al. [Mukherjee D, Dhara S, Borst SC, Van Leeuwaarden JSH (2017) Optimal service elasticity in large-scale distributed systems. Proc. ACM Measurement Anal. Comput. Systems 1(1):25:1–25:28.], which offers an efficient scalable implementation and yet achieves asymptotically optimal steady-state delay performance and energy consumption as the number of servers N → ∞. In the aforementioned work, the asymptotic results are obtained under the assumption that the queues have fixed-size finite buffers, and therefore, the fundamental question of stability of the proposed scheme with infinite buffers was left open. In this paper, we address this fundamental stability question. The system stability under the usual subcritical load assumption is not automatic. Moreover, the stability may not even hold for all N. The key challenge stems from the fact that the process lacks monotonicity, which has been the powerful primary tool for establishing stability in load-balancing models. We develop a novel method to prove that the subcritically loaded system is stable for large enough N and establish convergence of steady-state distributions to the optimal one as N → ∞. The method goes beyond the state-of-the-art techniques; it uses an induction-based idea and a “weak monotonicity” property of the model. This technique is of independent interest and may have broader applicability.
Highlights
We develop a novel method for proving large-N stability for subcritically loaded systems and use it to establish the convergence of the sequence of steady-state distributions as N → ∞.
We study the stability of systems under the token-based auto balance scaling (TABS) scheme and establish large-scale asymptotics of the sequence of steady states.
Understanding the stability of stochastic systems is of fundamental importance.
Summary
A large proportion of the tasks processed by data centers come with business-critical performance requirements. This forces service providers to increase their capacity at a tremendous rate to cope with high-demand periods under a time-varying demand pattern. The study of load-balancing schemes in large-scale systems has a rich history, and for decades much research has sought to understand the fundamental trade-off between delay performance and communication overhead per task. A token-based joint load-balancing and autoscaling scheme called token-based auto balance scaling (TABS) was proposed by Mukherjee et al. (2017); it offers an efficient scalable implementation and yet achieves asymptotically optimal steady-state delay performance and energy consumption as the number of servers N → ∞. Understanding the stability of the TABS scheme without the finite-buffer restriction remains an important open challenge.