Abstract

This paper presents SFT, a consistent checkpointing algorithm with a shorter freezing time, which provides fault tolerance in distributed systems. Its features include shorter freezing time, lower overhead, and simple rollback. To reduce checkpointing time, a special control message (Munblock) ensures that a process can respond to a checkpoint event quickly at any given time. In addition, a main-memory algorithm is used to improve the concurrency of checkpointing. With SFT, the freezing time caused by checkpointing is less than 0.03 s, and the number of control messages is only O(n).
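
The abstract does not spell out the main-memory algorithm, so the following is only a minimal sketch of the general idea, not the SFT algorithm itself. It assumes the process is blocked only while its state is copied in memory, and that the slower write to stable storage runs in a background thread. The names `MemoryCheckpointer`, `checkpoint`, `wait`, and `restore` are hypothetical and do not come from the paper.

```python
import copy
import os
import pickle
import threading


class MemoryCheckpointer:
    """Hypothetical helper: snapshot state in RAM, persist it in the background."""

    def __init__(self, directory):
        self.directory = directory
        self._writers = []

    def checkpoint(self, state, seq):
        # Freeze window (assumption): only this in-memory copy runs
        # while the process is blocked.
        snapshot = copy.deepcopy(state)
        path = os.path.join(self.directory, f"ckpt_{seq}.pkl")
        # The disk write proceeds concurrently with normal execution.
        writer = threading.Thread(target=self._persist, args=(snapshot, path))
        writer.start()
        self._writers.append(writer)
        return path

    @staticmethod
    def _persist(snapshot, path):
        # Write to a temporary file, then rename it, so a crash during
        # the write never leaves a partial checkpoint under the final name.
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(snapshot, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)

    def wait(self):
        # Block until every pending background write has finished.
        for writer in self._writers:
            writer.join()
        self._writers.clear()


def restore(path):
    # Rollback: reload the state saved at the checkpoint.
    with open(path, "rb") as f:
        return pickle.load(f)
```

In this sketch the process may modify `state` as soon as `checkpoint` returns, because the background writer works on a private copy. The cost is extra memory for that copy, traded for a freeze that is bounded by a memory copy rather than by disk latency.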
