Abstract

This article presents a distributed directory-based cache coherence protocol that improves performance and facilitates error recovery in large-scale multiprocessors. A number of distributed directory-based protocols, such as the Scalable Coherent Interface (SCI, ANSI/IEEE Std 1596), use a linked-list structure to maintain cache coherence. While they work well for small to medium-sized systems, the list traversal overhead becomes high when the system grows to thousands of processors. The system is also vulnerable to a single node failure, because recovery from such a failure involves all the processors in the system. Single node failures can occur relatively frequently when such a protocol is applied to SCI-based Local Area MultiProcessors (LAMP), where individual nodes are autonomous computers that can power up and down independently. We propose an enhancement to the linked-list approach. A redundant spanning list is constructed when the list is built, which achieves two goals: 1) the list traversal time is reduced from O(N) to O(√N), and 2) recovery from a single node failure is confined to the processors involved in the failed list, unless the head of the list is lost.
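
The abstract does not describe how the spanning list is laid out, so the following is only a minimal sketch under an assumed layout, not the protocol from the article. It assumes that every k-th node of the sharing list (k ≈ √N) carries one extra pointer to the next such node, and it counts the hops needed to reach a node from the head with and without those pointers. The names `build_lists`, `hops_plain`, `hops_spanning`, and `span_ptr` are hypothetical and chosen for illustration.

```python
import math


def build_lists(n):
    """Build a sharing list of n nodes with head 0 (illustrative only).

    next_ptr[i] is the ordinary linked-list successor of node i.
    span_ptr[i] is an ASSUMED extra pointer, set only on every k-th node
    (k = floor(sqrt(n))), that skips ahead to the next such node.
    """
    k = max(1, math.isqrt(n))
    next_ptr = [i + 1 if i + 1 < n else None for i in range(n)]
    span_ptr = [None] * n
    for i in range(0, n, k):
        if i + k < n:
            span_ptr[i] = i + k
    return next_ptr, span_ptr, k


def hops_plain(target):
    """Hops from the head to `target` using only successor pointers."""
    return target


def hops_spanning(next_ptr, span_ptr, target):
    """Hops from the head to `target`, taking a spanning pointer whenever
    it does not overshoot the target."""
    node, hops = 0, 0
    while node != target:
        skip = span_ptr[node]
        if skip is not None and skip <= target:
            node = skip
        else:
            node = next_ptr[node]
        hops += 1
    return hops


if __name__ == "__main__":
    n = 1024
    next_ptr, span_ptr, k = build_lists(n)
    worst_plain = max(hops_plain(t) for t in range(n))
    worst_span = max(hops_spanning(next_ptr, span_ptr, t) for t in range(n))
    print(worst_plain, worst_span)  # 1023 62
```

Under this assumed layout the worst case drops from N - 1 hops to about 2√N hops, which is consistent with the O(N) to O(√N) reduction stated above. The sketch says nothing about how the real protocol builds the list, keeps it coherent, or uses its redundancy for failure recovery.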
