Abstract

A fault-tolerant mutual exclusion algorithm for distributed systems is presented. The algorithm uses a distributed queue strategy and maintains alternative paths at each site to provide a high degree of fault tolerance. However, owing to these alternative paths, the algorithm must use reverse messages to avoid directed cycles, which may form when the direction of edges is reversed after the token passes through. If there is no alternative path, the total number of messages exchanged is O(2*log N) in light traffic and two messages in heavy traffic; in this case, however, the system cannot tolerate even a single communication link or site failure. If there are alternative paths between sites, the system can achieve a higher degree of fault tolerance at the expense of increased message traffic (owing to reverse messages). Thus, there is a tradeoff between efficiency and reliability, and a system can be designed to balance these two criteria properly. A recovery procedure for consistently restoring a recovering site into the system is also presented.
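
The cycle problem mentioned above can be illustrated with a toy model. The sketch below is not the algorithm of the paper: it assumes three hypothetical sites A, B and C, represents each pointer toward the token holder as a directed edge, and models a "reverse message" simply as flipping every edge that still leaves the new token holder. The function names are invented for this illustration.

```python
def has_directed_cycle(edges):
    """Return True if the directed graph given as (src, dst) pairs has a cycle."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:       # back edge: a directed cycle exists
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


def pass_token(edges, path):
    """Reverse each edge the token traverses.

    `path` lists the sites from the old holder to the new holder.
    Edges point toward the holder, so each traversed edge is flipped.
    """
    result = set(edges)
    for holder, nxt in zip(path, path[1:]):
        result.discard((nxt, holder))
        result.add((holder, nxt))
    return result


def send_reverse_messages(edges, new_holder):
    """Assumed model of reverse messages: flip edges still leaving the holder."""
    return {(dst, src) if src == new_holder else (src, dst) for src, dst in edges}


# Token starts at C; A requests it along A -> B -> C, so it travels C, B, A.
tree = {("A", "B"), ("B", "C")}           # no alternative path
with_alt = tree | {("A", "C")}            # A also keeps an alternative path to C

after_tree = pass_token(tree, ["C", "B", "A"])
after_alt = pass_token(with_alt, ["C", "B", "A"])
repaired = send_reverse_messages(after_alt, "A")

print(has_directed_cycle(after_tree))     # tree case
print(has_directed_cycle(after_alt))      # alternative path, no reverse message
print(has_directed_cycle(repaired))       # alternative path, after reverse message
```

In this model the tree stays acyclic after the token moves, whereas the untraversed alternative edge from A to C closes the cycle A, C, B, A until it is reversed as well. The extra step is the additional message traffic that the abstract attributes to reverse messages.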
