Abstract

In distributed system fault tolerance is an important issue. Many applications executing in present scenario with several processors have to face with problems related to consistency and availability. Complete process will fail with the failure of a single component. There are many existing approaches which assure reliable execution, are based on fault tolerance mechanisms. We talk about the basic concept of fault tolerance, which is to make a network system tolerant enough to work properly, may be with a little low efficiency, in case of any fault. A good fault tolerant system will avoid further failures. After transient failures main problem is to bring a distributed system to a consistent state. We worked on two parts of this problem by providing a distributed system to create consistent checkpoints as well as replication is focused. We have given an algorithm for replication and implemented it in Java RMI. We have done two things: First the checkpoints are replicated and Second, Servers are replicated on different system using that algorithm.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call