Controlling the number of active instances in a cloud environment

Diego Goldsztajn, Andres Ferragut, Matthieu Jonckheere, Fernando Paganini

doi:10.1145/3199524.3199529

Abstract

We study a cloud environment in which computing instances may either be reserved in advance or dynamically spawned to serve a fluctuating or unknown load. We first consider a centralized scheme where a system operator maintains the job queue and controls the spawning of additional capacity; through queueing models and their fluid and diffusion counterparts, we explore the tradeoff between queueing delay and service capacity variability. Second, we consider the setting of a dispatcher who must send jobs immediately, with no delay, to decentralized instances, and who may in addition summon extra capacity. Here the capacity scaling problem is coupled with one of load balancing. We show how the popular join-the-idle-queue policy can be combined with an adequate rule for spawning instances, yielding an equilibrium with no queueing delay while controlling service capacity variability; we also accommodate the case where spawned instances incur a startup delay. Finally, we analyze the question of deciding, for a given pricing structure for the cloud service, how many fixed instances should be reserved in advance. The behavior of these policies is illustrated by simulations.
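As a rough illustration of the decentralized setting, the following Python sketch shows a join-the-idle-queue dispatcher paired with a spawning rule. The abstract does not state the spawning rule, so the one used here (spawn a new instance whenever a job arrives and no instance is idle) is an assumption made for illustration only, not necessarily the rule analyzed in the paper. The class and method names are hypothetical, and startup delay and instance termination are not modeled.

```python
from collections import deque


class JiqDispatcher:
    """Join-the-idle-queue dispatcher with an ASSUMED spawn-when-no-idle rule."""

    def __init__(self, reserved):
        # Number of jobs currently held by each instance.
        self.jobs = [0] * reserved
        # Identifiers of instances that are currently idle.
        self.idle = deque(range(reserved))

    def dispatch(self):
        """Send an arriving job to an instance immediately; return its id."""
        if self.idle:
            # Join-the-idle-queue: pick any instance known to be idle.
            instance = self.idle.popleft()
        else:
            # Assumed rule (not taken from the source): spawn a new instance.
            self.jobs.append(0)
            instance = len(self.jobs) - 1
        self.jobs[instance] += 1
        return instance

    def complete(self, instance):
        """Record a job completion; an emptied instance reports itself idle."""
        if self.jobs[instance] == 0:
            raise ValueError("instance has no job to complete")
        self.jobs[instance] -= 1
        if self.jobs[instance] == 0:
            self.idle.append(instance)


dispatcher = JiqDispatcher(reserved=2)
first = dispatcher.dispatch()    # goes to a reserved instance
second = dispatcher.dispatch()   # goes to the other reserved instance
third = dispatcher.dispatch()    # no idle instance left, so one is spawned
dispatcher.complete(second)      # instance `second` becomes idle again
fourth = dispatcher.dispatch()   # reuses the idle instance
```

Under this assumed rule every job lands on an empty instance, so no job waits behind another; the cost is that the number of active instances follows the load, which is the capacity variability the abstract refers to.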

Full Text