Abstract
Aggressive scaling in deep nanometer technology enables chip multiprocessor design, facilitated by the communication-centric architecture provided by Network-on-Chip (NoC). At the same time, it brings considerable reliability challenges, because a fault in the network architecture severely impacts the performance of a system. To deal with these challenges, this research proposes NoCGuard, a reconfigurable architecture designed to tolerate multiple permanent faults in each pipeline stage of the generic router. The NoCGuard router architecture uses four highly reliable and low-cost fault-tolerant strategies. We exploit resource borrowing and double routing strategies for the routing computation stage, a default winner strategy for the virtual channel allocation stage, runtime arbiter selection and default winner strategies for the switch allocation stage, and a multiple secondary bypass paths strategy for the crossbar stage. Unlike existing reliable router architectures, our architecture features less redundancy, more fault tolerance, and high reliability. Reliability comparison using the Mean Time to Failure (MTTF) metric shows a 5.53-fold improvement in lifetime, and comparison using the Silicon Protection Factor (SPF) shows a 22-fold improvement, both better than state-of-the-art reliable router architectures. Synthesis results using 15 nm and 45 nm technology libraries show that the additional circuitry incurs an area overhead of 28.7% and 28%, respectively. Latency analysis using synthetic, PARSEC, and SPLASH-2 traffic shows a minor performance overhead of 3.41%, 12%, and 15%, respectively, while providing high reliability.
Highlights
A conventional way to increase chip performance is to improve its operational frequency. The power consumption of a chip shares a linear relationship with its operating frequency. It forces the designers to search for other ways to increase performance without exponentially increasing power consumption.
This led to the design of chip multi-processors (CMP) or multi-core architectures with high performance and low power consumption [2].
To facilitate fault tolerance at virtual channel allocation (VA), we propose to add two registers per 20:1 arbiter as shown in IDVC (Identification of the Virtual Channel), holds the identification of the default winner virtual channel (VC) and is
Summary
A conventional way to increase chip performance is to improve its operational frequency. Aggressive technology scaling in a deep nanometer regime enables the fabrication of billions of transistors on a chip [1]. This led to the design of chip multi-processors (CMP) or multi-core architectures with high performance and low power consumption [2]. NoC architecture is a packet-based interconnection network that separates communication from computation. Because it differs from the shared bus, it facilitates customization in terms of bandwidth, buffer size, and topology. We work on a permanent fault tolerance mechanism for each pipeline stage of the router. It ensures connectivity of the healthy core associated with the faulty router. The rest of the paper is organized as follows: Section 2 presents an overview of existing reliable router architectures and fault detection mechanisms.