Abstract

In a cloud computing environment, hardware and software services are provided to users across multiple servers and data centers. These servers communicate with each other to allow greater scalability, flexibility, and reliability. Reliability is a vital factor in cloud computing: it ensures that services are delivered to users whenever they request them. However, hardware or software faults may occur in cloud servers or data centers and prevent users from receiving a service. Fault tolerance is the ability of a system to keep providing services to users even in the presence of faults or failures. In this review, we focus on some of the emerging fault tolerance techniques that researchers have proposed to tackle faults in cloud computing. We divide these techniques into two main categories: proactive and reactive. Proactive techniques protect the system from faults by applying procedures that prevent it from reaching a faulty state. Reactive techniques concern the ability of the cloud system to recover a faulty server or framework so that it can continue working and providing the service.
