Abstract

While various checkpointing schemes have been widely used to reduce recovery time after a fault, determining the optimal checkpoint interval that maximizes system availability has been a critical research issue for decades. The evaluation can be done by developing analytical models with restrictive assumptions. However, analytical models have reached their limits as checkpointing schemes have become more complicated. This paper proposes using stochastic Petri net models for the evaluation and demonstrates the effectiveness of the approach through case studies. It develops stochastic Petri net models and shows how to obtain the optimal checkpoint intervals for systems employing two widely used checkpointing schemes: the Checkpoint with Rollback Recovery scheme for uniprocessor systems and the Primary Site Approach for multiprocessor systems.
