Fundamentals of fault-tolerant distributed computing in asynchronous environments

Felix C Gärtner

doi:10.1145/311531.311532

Open Access

Journal: ACM Computing Surveys
Publication Date: Mar 1, 1999
Citations: 335

Affiliation: Technical University of Darmstadt

#Underlying System Model #Distributed Computing

Abstract

Fault tolerance in distributed computing is a wide area with a significant body of literature that is vastly diverse in methodology and terminology. This paper aims at structuring the area and thus guiding readers into this interesting field. We use a formal approach to define important terms like fault, fault tolerance, and redundancy. This leads to four distinct forms of fault tolerance and to two main phases in achieving them: detection and correction. We show that this can help to reveal inherently fundamental structures that contribute to understanding and unifying methods and terminology. By doing this, we survey many existing methodologies and discuss their relations. The underlying system model is the close-to-reality asynchronous message-passing model of distributed computing.
