Abstract
We show how a commercial OS can be successfully recovered from a crash. Support from the virtualization layer (the hypervisor) can significantly help in diagnosing and recovering the OS. We evaluate the time taken to automatically recover from an OS crash for different workloads. This technology can significantly reduce downtime and maintenance costs in data centers. It can be easily integrated into the support operations of OS vendors.

Many OS crashes are caused by bugs in kernel extensions or device drivers, even when the OS itself has been tested rigorously. To make an OS immortal, we must resurrect it from these crashes. We present a novel OS-hypervisor infrastructure that allows automated and transparent OS crash diagnosis and recovery in a virtual environment. This infrastructure eliminates the need for reboots or checkpoint-restart mechanisms, which require preserving the state of critical applications before the crash happens and also require extensive modifications to those applications. At the core of our approach is a small, hidden OS repair image that is dynamically created from the healthy running OS instance. When the OS crashes, the hypervisor dynamically loads this repair image to perform diagnosis and repair. One repair method we have experimented with is to quarantine the offending process and automatically resume the repaired OS without a reboot. Experimental evaluations demonstrated that recovery from an OS crash takes less than 3 s. This approach can significantly reduce downtime and maintenance costs in data centers, and it is the first design and implementation of an OS-hypervisor combination capable of automatically resurrecting a crashed commercial server OS. In addition to online diagnosis and recovery, this infrastructure can also be used for offline diagnosis and can be incorporated into the technical support tools of the OS vendor.
Additionally, we have used parts of this infrastructure to speed up the diagnosis of AIX OS crashes for the IBM technical support teams.
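
The recovery flow described in the abstract (crash, load the repair image, diagnose, quarantine the offending process, resume without a reboot) can be illustrated with a toy simulation. This is only a minimal sketch of the control flow, not the authors' implementation: every name in it (`GuestOS`, `RepairImage`, `Hypervisor`, `on_guest_crash`, `faulting_pid`) is hypothetical, and the diagnosis step is reduced to reading a process ID assumed to be recorded at crash time.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class GuestState(Enum):
    RUNNING = "running"
    CRASHED = "crashed"


@dataclass
class Process:
    pid: int
    name: str
    quarantined: bool = False


@dataclass
class GuestOS:
    """Toy model of a guest OS (hypothetical, for illustration only)."""
    processes: Dict[int, Process] = field(default_factory=dict)
    state: GuestState = GuestState.RUNNING
    faulting_pid: Optional[int] = None  # assumed to be recorded at crash time

    def crash(self, pid: int) -> None:
        # Simulate a crash triggered while running on behalf of `pid`.
        self.state = GuestState.CRASHED
        self.faulting_pid = pid


class RepairImage:
    """Hypothetical stand-in for the hidden repair image."""

    def diagnose(self, guest: GuestOS) -> Optional[int]:
        # Assumption: diagnosis simply reads the recorded faulting process.
        return guest.faulting_pid

    def repair(self, guest: GuestOS, pid: int) -> None:
        # Quarantine the offending process so it cannot run again.
        guest.processes[pid].quarantined = True
        guest.faulting_pid = None


class Hypervisor:
    def __init__(self, guest: GuestOS) -> None:
        self.guest = guest
        # The repair image is prepared while the guest is still healthy.
        self.repair_image = RepairImage()

    def on_guest_crash(self) -> bool:
        """Try to recover the guest; return True if it was resumed."""
        if self.guest.state is not GuestState.CRASHED:
            return False
        pid = self.repair_image.diagnose(self.guest)
        if pid is None or pid not in self.guest.processes:
            return False  # diagnosis failed; guest stays crashed
        self.repair_image.repair(self.guest, pid)
        self.guest.state = GuestState.RUNNING  # resume without a reboot
        return True


if __name__ == "__main__":
    guest = GuestOS(processes={1: Process(1, "database"), 2: Process(2, "faulty_app")})
    hypervisor = Hypervisor(guest)
    guest.crash(pid=2)
    print("recovered:", hypervisor.on_guest_crash())
    print("guest state:", guest.state.value)
```

In this sketch the other processes are left untouched, which mirrors the abstract's point that recovery does not require checkpointing or modifying critical applications.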