Abstract

Existing fault tolerance approaches in the cloud are broadly based on replication or checkpointing, each with its own advantages and limitations. This paper presents an adaptable fault tolerance method that determines which of the two approaches is appropriate for the successful execution of a task under the given cloud conditions. The proposed method classifies the failure risk of the host machines available for task execution based on their failure history. Fuzzy logic is then used to select the appropriate fault tolerance approach by considering the host's failure risk, the user-defined task priority, and the level of resource redundancy. Setting a task's priority gives the user control over the desired level of fault tolerance, while the availability of resources reflects the cloud provider's capability to offer it. Simulation experiments verify that proactively selecting a fault tolerance method increases the number of tasks that complete successfully.
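
The selection step described above can be illustrated with a minimal sketch. It is not the paper's actual rule base: the linear membership functions, the single replication rule, the [0, 1] scaling of all three inputs, and the names `failure_risk` and `choose_strategy` are assumptions made purely for illustration.

```python
def clamp01(x):
    """Clamp a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def high(x):
    """Assumed linear membership in the fuzzy set 'high'."""
    return clamp01(x)


def low(x):
    """Assumed linear membership in the fuzzy set 'low'."""
    return 1.0 - clamp01(x)


def failure_risk(failed_runs, total_runs):
    """Hypothetical risk estimate: fraction of past runs on a host that failed.

    A host with no history is treated as risk 0.0 (an assumption).
    """
    if total_runs == 0:
        return 0.0
    return failed_runs / total_runs


def choose_strategy(risk, priority, redundancy):
    """Pick 'replication' or 'checkpointing' from three inputs in [0, 1].

    Assumed rules (fuzzy AND = min, fuzzy OR = max):
      1. risk high AND priority high AND redundancy high -> replication
      2. risk low OR priority low OR redundancy low      -> checkpointing
    The strategy whose rule fires more strongly is returned.
    """
    replication = min(high(risk), high(priority), high(redundancy))
    checkpointing = max(low(risk), low(priority), low(redundancy))
    return "replication" if replication > checkpointing else "checkpointing"


if __name__ == "__main__":
    risk = failure_risk(failed_runs=9, total_runs=10)
    print(choose_strategy(risk, priority=0.8, redundancy=0.7))
    print(choose_strategy(risk, priority=0.8, redundancy=0.2))
```

Under these assumed rules, replication is chosen only when a risky host, a high-priority task, and ample spare resources coincide. Otherwise the sketch falls back to checkpointing, which needs no redundant hosts.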
