Abstract

As the demand for highly parallel systems grows, the vast amount of concurrently operating hardware involved can make it difficult to guarantee proper system behavior. Problems arise both from permanent and transient hardware faults and from errors caused by improper programming. A number of fault tolerance solutions have emerged. Following a survey of fault tolerance in arrays, a discussion of solutions for more specialized architectures is presented.

