Abstract

Message-passing systems with a communication protocol transparent to the applications typically require message logging to ensure consistency between checkpoints. A periodic independent checkpointing scheme with optimistic logging to reduce performance degradation during normal execution while keeping the recovery cost acceptable is described. Both time and space overhead for message logging can be reduced by detecting messages that need not be logged. A checkpoint space reclamation algorithm is presented to reclaim all checkpoints which are not useful for any possible future recovery. Communication trace-driven simulation for several hypercube programs is used to evaluate the techniques. >

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call