Live migration of virtual machines (VM) across physical hosts provides a significant new benefit for administrators of data centers and clusters. Previous memory-to-memory approaches demonstrate the effectiveness of live VM migration in local area networks (LAN), but they would cause a long period of downtime in a wide area network (WAN) environment. This paper describes the design and implementation of a novel approach, namely, CR/TR-Motion, which adopts checkpointing/recovery and trace/replay technologies to provide fast, transparent VM migration for both LAN and WAN environments. With execution trace logged on the source host, a synchronization algorithm is performed to orchestrate the running source and target VMs until they reach a consistent state. CR/TR-Motion can greatly reduce the migration downtime and network bandwidth consumption. Experimental results show that the approach can drastically reduce migration overheads compared with memory-to-memory approach in a LAN: up to 72.4 percent on application observed downtime, up to 31.5 percent on total migration time, and up to 95.9 percent on the data to synchronize the VM state. The application performance overhead due to migration is kept within 8.54 percent on average. The results also show that for a variety of workloads migrated across WANs, the migration downtime is less than 300 milliseconds.
Read full abstract