Abstract

The reliability and availability of cloud systems have become major concerns for service providers, brokers, and end users. Consequently, fault-tolerance mechanisms in cloud computing attract intense attention in both industry and academia. Task scheduling, which distributes tasks across a group of instances for execution, is one way to improve the fault tolerance of cloud systems. Much work on task scheduling has aimed to improve the overall outcome of cloud computing, for example by improving service quality and reducing power consumption. However, little of it has studied the problem of lost tasks from the broker’s perspective. Tasks can be lost because of virtual machine failures, server crashes, connection interruptions, and similar events. In a broker-based approach, the broker can allocate the backup task on the same cloud service provider (CSP) or on a different CSP, for example to reduce costs. This paper proposes a novel fault-tolerant mechanism that employs the primary-backup (PB) model of task scheduling to address this issue. The proposed mechanism minimizes the impact of failure events by reducing the number of lost tasks. The mechanism is further improved to shorten the makespan of submitted tasks in cloud systems. Experiments demonstrated that the proposed mechanism decreased the number of lost tasks by about 13%–15% compared with other mechanisms in the literature.
