Abstract
Wide-area Virtual-Machine (VM) live migration can serve as a disaster-recovery solution for IT services by moving virtualized servers to safe locations when a critical disaster strikes. In this scenario, it is desirable to evacuate as many VMs as possible under limited and changing electrical power and network conditions. The challenges are that 1) when many VMs are migrated simultaneously, the migration time of each individual VM increases, resulting in a high probability of migration failures due to power or network failures, 2) the sequential migration of VMs may not use the network efficiently, and 3) network conditions, such as available bandwidth and congestion, fluctuate over time. There is a need to solve a multi-objective problem that aims at simultaneously reducing the total migration time and the individual migration times. In this paper, we focus on pre-copy migration and present 1) the design and implementation of a feedback-based control system that manages VM migrations of multiple servers and tackles the aforementioned challenges, 2) findings from extensive experiments, and 3) a metric for evaluating migration performance that takes into account both the total and individual migration times. The proposed system monitors the network usage of hosts, adjusts migration parameters, and coordinates the migration scheduling of VMs. It is a promising approach to efficiently transfer IT services from a damaged data center to a fully functional one by automatically managing migrations across data centers. Experiments are conducted with several combinations of parameters, including network conditions, migration strategies, controller type, memory distribution, and live/offline VM migrations.
The results show 1) the usefulness of a feedback-based controller with a global view that can coordinate multiple physical machines to use network resources efficiently and reduce migration times, 2) the factors that affect the migration performance of multiple hosts, 3) the potential of improving sequential VM migration by integrating support for parallel TCP connections, and 4) that the proposed control system finds a near-optimal operating point that balances the total migration time and the individual migration times.