Abstract
A distinctive feature of cloud computing is that it enables customers to dynamically summon server instances. Service providers facing uncertain demand patterns may exploit this feature by setting automatic provisioning rules for right-sizing the capacity contracted from the cloud. This situation can be modeled by a queueing system where the numbers of both jobs and servers evolve in time, the latter subject to delays in creation and deletion. We study in this context different feedback rules with the objective of efficiently matching capacity and load, while simultaneously providing a high quality of service. These rules are analyzed by means of fluid and diffusion limits for Markov chains. In particular, we develop suitable extensions of the classical literature on this topic, required to accommodate non-homogeneous intensity scalings and non-differentiable drift fields. With these tools, our final proposal is shown to exhibit properties akin to the Halfin–Whitt regime, achieved automatically without knowledge of the system load. We further investigate by simulation its behavior under time-varying load, demonstrating the capabilities of our design to provide quality and efficiency in highly dynamic scenarios.