On the Design of Fault-Tolerant Scheduling Strategies Using Primary-Backup Approach for Computational Grids with Low Replication Costs

Qin Zheng, Chen-Khong Tham, Bharadwaj Veeravalli

doi:10.1109/tc.2008.172

Abstract

Fault-tolerant scheduling is essential for large-scale computational grid systems, in which geographically distributed nodes often cooperate to execute a task. The primary-backup approach is a common fault-tolerance methodology in which each task has a primary copy and a backup copy on two different processors. In this paper, we identify two cases that may arise when scheduling dependent tasks with the primary-backup approach, and we derive two important constraints that must be satisfied. We further show that these two constraints play a crucial role in limiting the schedulability and overloading efficiency of backups of dependent tasks. We then propose two strategies to improve schedulability and overloading efficiency, respectively. We propose two algorithms, MRC-ECT and MCT-LRC, to schedule backups of independent jobs and dependent jobs, respectively. MRC-ECT is shown to guarantee an optimal backup schedule in terms of replication cost for an independent task, while MCT-LRC can schedule a backup of a dependent task with minimum completion time and less replication cost. We conduct extensive simulation experiments to quantify the performance of the proposed algorithms.
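To make the primary-backup idea concrete, the following is a minimal sketch of placing a primary and a backup copy of each independent task on two different processors. It is not MRC-ECT or MCT-LRC, whose details the abstract does not give. The greedy earliest-free-processor rule, the assumption that a backup starts no earlier than its primary finishes, the absence of backup overloading, and all names (`Placement`, `schedule_primary_backup`) are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    task: str
    primary: int           # processor index holding the primary copy
    backup: int            # processor index holding the backup copy
    primary_finish: float
    backup_finish: float


def schedule_primary_backup(tasks, num_processors):
    """Greedy placement of independent tasks (illustrative assumption,
    not the paper's algorithm).

    tasks: list of (name, execution_time) pairs.
    """
    if num_processors < 2:
        raise ValueError("primary-backup needs at least two processors")

    free_at = [0.0] * num_processors  # time at which each processor is free
    placements = []
    for name, exec_time in tasks:
        # Primary copy: the processor that becomes free earliest.
        p = min(range(num_processors), key=lambda i: free_at[i])
        primary_finish = free_at[p] + exec_time
        free_at[p] = primary_finish

        # Backup copy: earliest-free processor other than the primary's.
        b = min((i for i in range(num_processors) if i != p),
                key=lambda i: free_at[i])
        # Assumption: the backup is reserved to start only after the
        # primary's finish time, and backups never share a time slot.
        backup_start = max(free_at[b], primary_finish)
        backup_finish = backup_start + exec_time
        free_at[b] = backup_finish

        placements.append(
            Placement(name, p, b, primary_finish, backup_finish))
    return placements


if __name__ == "__main__":
    for pl in schedule_primary_backup([("t1", 2.0), ("t2", 3.0)], 2):
        print(pl)
```

Because every backup here reserves its own time slot, this sketch pays the full replication cost for each task; reducing that cost is the concern the abstract attributes to the proposed algorithms.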

Journal: IEEE Transactions on Computers
Publication Date: Mar 1, 2009
Citations: 104

Similar Papers

Fault-tolerant scheduling for differentiated classes of tasks with low replication cost in computational grids
Qin Zheng ... Chen-Khong Tham
25 Jun 2007

On the design of communication-aware fault-tolerant scheduling algorithms for precedence constrained tasks in grid computing systems with dedicated communication devices
Qin Zheng ... Bharadwaj Veeravalli
Journal of Parallel and Distributed Computing | VOL. 69
07 Dec 2008

Fault-Tolerant Scheduling of Independent Tasks in Computational Grid
Qin Zheng ... Chen-Khong Tham
01 Jan 2006

Communication-aware Fault-tolerant Scheduling Strategy for Precedence Constrained Tasks in Heterogeneous Distributed Systems
Weipeng Jing ... Qu Wu
International Journal of Digital Content Technology and its Applications | VOL. 5
30 Jun 2011
