Abstract

Current multiprocessors such as the Cray T3D support interprocessor communication using partitioned dimension-order routers (PDRs). In a PDR implementation, the routing logic and switching hardware are partitioned into multiple modules, each suitable for implementation as a single chip. This paper proposes a method for incorporating fault tolerance into such routers with simple changes to the router structure and logic. Previously known fault-tolerant routing methods assume centralized crossbar-based routers and are not applicable to multiprocessors with PDRs. The proposed technique works for the convex fault model and uses only local knowledge of faults. Using the proposed technique and as few as four virtual channels per physical channel, torus networks with PDRs can handle faults without compromising deadlock- and livelock-freedom. Simulations of 2-dimensional torus and mesh networks show that the resulting fault-tolerant PDRs perform similarly to crossbar-based routers.

