Analysis of checkpointing schemes for multiprocessor systems

A Ziv, J Bruck

doi:10.1109/reldis.1994.336909

Abstract

Parallel computing systems provide hardware redundancy that helps to achieve low-cost fault tolerance by duplicating a task on more than one processor and comparing the states of the processors at checkpoints. This paper suggests a novel technique, based on a Markov reward model (MRM), for analyzing the performance of checkpointing schemes with task duplication. We show how this technique can be used to derive the average execution time of a task and other important parameters related to the performance of checkpointing schemes. Our analytical results closely match the values we obtained using a simulation program. We compare the average task execution time and total work of four checkpointing schemes, and show that, in general, increasing the number of processors reduces the average execution time but increases the total work done by the processors. However, when the times required for different operations differ greatly, these results can change.
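To make the comparison of analytical and simulated execution times concrete, the following is a minimal sketch of one simple duplication scheme, not the paper's own model or any of its four schemes. It assumes (hypothetically) that a task consists of `n` intervals of length `t`, that two processors run each interval in parallel, that each processor suffers faults at a constant rate `lam`, that comparing states at a checkpoint costs `c`, and that any fault always produces a detectable mismatch that forces the interval to be repeated. Under these assumptions the number of attempts per interval is geometric, so the mean execution time has a closed form that a Monte Carlo run can be checked against.

```python
import math
import random


def expected_time(n, t, c, lam):
    """Closed-form mean execution time under the assumed rollback scheme."""
    # Probability that both processors stay fault-free over one interval.
    q = math.exp(-2.0 * lam * t)
    # Each attempt costs t + c; the mean number of attempts per interval is 1/q.
    return n * (t + c) / q


def simulate(n, t, c, lam, runs=20000, seed=0):
    """Monte Carlo estimate of the same quantity (illustrative only)."""
    rng = random.Random(seed)
    p_fault = 1.0 - math.exp(-lam * t)  # per-processor fault probability per interval
    total = 0.0
    for _ in range(runs):
        elapsed = 0.0
        done = 0
        while done < n:
            elapsed += t + c  # execute the interval, then compare states
            a_ok = rng.random() >= p_fault
            b_ok = rng.random() >= p_fault
            if a_ok and b_ok:  # states match: commit the checkpoint
                done += 1
            # otherwise: mismatch, roll back and repeat the interval
        total += elapsed
    return total / runs


if __name__ == "__main__":
    args = (10, 1.0, 0.1, 0.05)  # hypothetical parameter values
    print("analytic :", expected_time(*args))
    print("simulated:", simulate(*args))
```

In this sketch the total work done by the two processors is simply twice the execution time. Schemes that use additional processors to avoid rollback would trade execution time against total work in the way the abstract describes, but modeling them requires the details given in the full paper.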

