Low-Cost and Fast Failure Recovery Using In-VM Containers in Clouds

Tomonori Morikawa, Kenichi Kourai

doi:10.1109/dasc/picom/cbdcom/cyberscitech.2019.00112

Abstract

Various services are now provided using virtual machines (VMs) in clouds, so it is necessary to prepare for failures of VMs, of the hosts running them, and even of entire data centers, e.g., by using active/standby clustering. However, traditional techniques impose a trade-off between the maintenance cost of additional VMs and the recovery time. For example, hot standby can rapidly fail over to the secondary system when a failure occurs, but the secondary system must always run the same number of VMs as the primary system. In contrast, cold standby does not need to run VMs until a failure occurs, but it has to boot VMs during failure recovery. In this paper, we propose VCRecovery, a system that achieves both low-cost and fast failure recovery. VCRecovery consolidates services in the secondary system using containers inside VMs (in-VM containers). For hot standby, it reduces the maintenance cost by running fewer VMs in the secondary system. For cold standby, it reduces the recovery time by quickly booting in-VM containers. If a VM becomes overloaded after recovery, VCRecovery can migrate several in-VM containers to other VMs. To synchronize storage between VMs in the primary system and in-VM containers in the secondary system, it efficiently performs minimal file-based synchronization based on software packages. We have implemented VCRecovery using LXD and Zabbix and evaluated its performance.
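
The consolidation and post-recovery migration described in the abstract can be illustrated with a small sketch. It is not VCRecovery's actual algorithm, which the abstract does not specify. It assumes each primary VM's service has a known integer load (for example, a CPU percentage), packs the corresponding in-VM containers into secondary VMs with a first-fit decreasing heuristic, and relieves an overloaded VM by moving its smallest containers elsewhere. The names `SecondaryVM`, `consolidate`, and `relieve_overload`, and all load figures, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryVM:
    """A VM in the secondary system hosting several in-VM containers."""
    capacity: int
    containers: dict = field(default_factory=dict)  # container name -> load

    @property
    def load(self) -> int:
        return sum(self.containers.values())


def consolidate(service_loads: dict, vm_capacity: int) -> list:
    """Pack one container per primary service into as few VMs as
    first-fit decreasing finds (an assumed placement policy)."""
    vms = []
    ordered = sorted(service_loads.items(), key=lambda kv: kv[1], reverse=True)
    for name, load in ordered:
        if load > vm_capacity:
            raise ValueError(f"{name} does not fit in a single VM")
        for vm in vms:
            if vm.load + load <= vm_capacity:
                vm.containers[name] = load
                break
        else:
            # No existing VM has room, so add another secondary VM.
            vm = SecondaryVM(vm_capacity)
            vm.containers[name] = load
            vms.append(vm)
    return vms


def relieve_overload(vms: list, vm_capacity: int) -> list:
    """Move the smallest containers off overloaded VMs, using a VM with
    spare capacity if one exists and a new VM otherwise."""
    for vm in list(vms):
        while vm.load > vm_capacity and len(vm.containers) > 1:
            name = min(vm.containers, key=vm.containers.get)
            load = vm.containers.pop(name)
            target = next(
                (t for t in vms if t is not vm and t.load + load <= vm_capacity),
                None,
            )
            if target is None:
                target = SecondaryVM(vm_capacity)
                vms.append(target)
            target.containers[name] = load
    return vms


# Five primary VMs, each running one service (hypothetical loads in percent).
loads = {"web": 50, "db": 60, "mail": 20, "dns": 10, "cache": 30}
vms = consolidate(loads, vm_capacity=100)
```

With these assumed loads, traditional hot standby would keep five secondary VMs running, whereas the sketch places the five containers on two. If the load of `cache` later grows, calling `relieve_overload(vms, 100)` moves containers until every VM is back within capacity.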

Full Text