Abstract
Advances in technology scaling, coupled with aggressive voltage scaling, result in significant reliability challenges for emerging Chip Multiprocessor (CMP) platforms, where error-prone caches continue to dominate the chip area. Network-on-Chip (NoC) fabrics are increasingly used to manage the scalability of these CMPs. We present a novel fault-tolerant scheme for the Last Level Cache (LLC) in CMP architectures that leverages the interconnection network to protect the LLC banks against permanent faults. During an LLC access to a faulty area, the network detects and corrects the faults and returns fault-free data to the requesting core. By leveraging the NoC fabric, we can implement any cache fault-tolerant scheme in an efficient, modular, and scalable manner. We perform extensive design space exploration on NoC benchmarks to demonstrate the utility and efficacy of our approach. The overheads of leveraging the NoC fabric are minimal: on an 8-core, 16-cache-bank CMP, we demonstrate reliable access to LLCs with additional overheads of less than 3% in area and less than 7% in power.