Abstract

Fault tolerance is the most pressing issue in providing reliable cloud services. The inherent vulnerability of cloud systems to failure hampers the performance and reliability of those services. Fault tolerance is therefore a mandatory feature for achieving reliability, yet it is hard to implement because of the cloud's dynamic infrastructure and complex interdependencies. Numerous fault tolerance techniques have been proposed in the literature to address the challenges of cloud reliability. The survey presented in this paper attempts to integrate the different fault tolerance architectures. It critically reviews existing fault tolerance techniques that aim to improve service reliability, availability, and application execution in the cloud. The paper also includes a comparative analysis of the reviewed frameworks, based on critical metrics such as failure prediction, detection strategy, failure history, VM placement, and limitations. This review is intended to facilitate the development of new fault tolerance techniques for the cloud environment.

