Rolling Upgrade with Dynamic Batch Size for IaaS Cloud

Mina Nabi, Ferhat Khendek, Maria Toeroe

doi:10.1109/cloud.2016.0072

Abstract

Cloud systems are upgraded regularly to improve performance, fix bugs, and deploy new software versions. The services they host, for example telecommunication services, may have stringent non-functional requirements: they may not tolerate more than five minutes of downtime per year, regardless of whether they are provided by a cloud system or whether the outage is due to an upgrade. Such services must remain highly available under all circumstances, which imposes availability requirements on the cloud system providing them. The dynamicity of the environment is one of the main challenges for maintaining High Availability (HA) in cloud deployments during upgrades. To maintain availability during upgrades, most cloud providers use rolling upgrades, and to avoid unexpected interference between the upgrade process and the cloud's scaling mechanism, they disable scaling for the duration of the upgrade. In this paper, we propose a novel approach for rolling upgrades that is applicable to, among others, IaaS cloud systems and that addresses HA. The approach mitigates the interference between the upgrade process, failure handling, and scaling by dynamically adjusting the upgrade process to changes in the cloud environment. Accordingly, the upgrade process starts or resumes only when the system has sufficient resources to perform an upgrade iteration, and it is suspended when this is not the case. As a result, scaling does not need to be disabled during upgrades; rather, the scaling operations regulate the pace of the upgrade.
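The start/resume/suspend behavior described in the abstract can be illustrated with a minimal sketch. It is not the algorithm from the paper: the names `ClusterState`, `batch_size`, and `rolling_upgrade`, the fixed `reserved_hosts` headroom, and the simplification that any spare host can be upgraded without migrating load are all assumptions made for illustration. The batch size is recomputed in every iteration from the resources currently free, and a batch size of zero stands for a suspended upgrade.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    total_hosts: int          # hosts currently in the system
    reserved_hosts: int       # assumed headroom kept for scale-out and failover
    hosts_in_use: int = 0     # hosts needed to carry the current load
    upgraded_hosts: int = 0   # hosts already running the new version


def batch_size(state: ClusterState) -> int:
    """Hosts that can be taken out of service for one upgrade iteration."""
    spare = state.total_hosts - state.hosts_in_use - state.reserved_hosts
    remaining = state.total_hosts - state.upgraded_hosts
    # A non-positive amount of spare capacity means the upgrade must wait.
    return max(0, min(spare, remaining))


def rolling_upgrade(state: ClusterState, load_trace: list[int]) -> list[int]:
    """Run one iteration per load sample and return the batch size of each.

    An entry of 0 means the upgrade was suspended in that iteration because
    scaling had consumed the spare capacity.
    """
    history = []
    for hosts_in_use in load_trace:
        if state.upgraded_hosts >= state.total_hosts:
            break  # every host has been upgraded
        state.hosts_in_use = hosts_in_use  # effect of scaling in or out
        n = batch_size(state)
        history.append(n)
        state.upgraded_hosts += n
    return history
```

For instance, with 10 hosts, one reserved host, and a load trace of `[6, 9, 9, 5, 5, 5]` hosts in use, the sketch upgrades three hosts, suspends for two iterations while the system is scaled out, and then finishes in two further iterations once the load drops. Scaling is never disabled; it only changes how many hosts each iteration may take.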

Full Text