Abstract

Clusters of computers can provide, in aggregate, reliable services despite the failure of individual computers. System-level virtualization is widely used to consolidate the workload of multiple physical systems as multiple virtual machines (VMs) on a single physical computer. A single physical computer thus forms a virtual cluster of VMs. A key difficulty with virtualization is that the failure of the virtualization infrastructure (VI) often leads to the failure of multiple VMs. This is likely to overload computing resiliency mechanisms, typically designed to tolerate the failure of only a single node at a time. By supporting recovery from failure of key VI components, we have enhanced the resiliency of a VI (Xen), thus enabling the use of existing computing techniques to provide resilient virtual clusters. In the overwhelming majority of cases, these enhancements allow recovery from errors in the VI to be accomplished without the failure of more than a single VM. The resulting resiliency of the virtual cluster is demonstrated by running two existing computing systems while subjecting the VI to injected faults.
