A Dynamically Adjusting Gracefully Degrading Link-Level Fault-Tolerant Mechanism for NoCs

Arseniy Vitkovskiy, Chrysostomos Nicopoulos, Vassos Soteriou

doi:10.1109/tcad.2012.2188801

Abstract

The rapid scaling of silicon technology has enabled massive transistor integration densities. Nanometer feature sizes, however, are marred by increasing variability and susceptibility to wear-out. Billion-transistor designs, such as chip multiprocessors (CMPs), are especially vulnerable to defects. CMPs rely on a network-on-chip (NoC) for all their communication needs. A single link failure within this on-chip fabric can impede, halt, or even deadlock intertile communication, which can render the entire chip multiprocessor useless. In this paper, we present a technique capable of handling very large numbers of permanent wire failures that occur in parallel links either at manufacture time or at runtime (dynamically). Rather than marking an entire parallel link as faulty whenever some wires fail, the proposed methodology employs these partially-faulty links (PFLs) to continue the transfer of information, albeit in a gracefully degraded mode, in order to maintain network connectivity. Furthermore, the presented technique can designate PFLs as fully-faulty when several wires fail, by utilizing appropriate routing algorithms that bypass nonoperational links while still maintaining load balance in the vicinity of PFLs. The proposed scheme employs architectural support within the on-chip routers to detect link failures and enable reconfiguration at the granularity of individual wires. Hardware synthesis confirms the low-cost nature of the proposed architecture, and full-system simulations using both synthetic network traffic and real workloads demonstrate its efficacy.
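The graceful-degradation idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's router architecture: the function names, the 8-wire link, and the `min_healthy_fraction` threshold are all hypothetical, chosen only to show how a flit might be serialized over the surviving wires of a PFL and how a link might be declared fully faulty once too few wires remain.

```python
from math import ceil
from typing import Dict, List


def classify_link(wire_ok: List[bool], min_healthy_fraction: float = 0.25) -> str:
    """Classify a parallel link from its per-wire health flags.

    The threshold is an assumption made for illustration; the abstract only
    says a PFL can be designated fully faulty "when several wires fail".
    """
    healthy = sum(wire_ok)
    if healthy == len(wire_ok):
        return "healthy"
    if healthy < min_healthy_fraction * len(wire_ok):
        return "fully_faulty"  # routing would bypass this link
    return "partially_faulty"  # keep using it in degraded mode


def cycles_per_flit(flit_bits: int, healthy_wires: int) -> int:
    """Cycles needed to push one flit through the surviving wires."""
    if healthy_wires <= 0:
        raise ValueError("link has no healthy wires")
    return ceil(flit_bits / healthy_wires)


def send_over_pfl(flit: List[int], wire_ok: List[bool]) -> List[Dict[int, int]]:
    """Split a flit into per-cycle chunks that drive only healthy wires.

    Each returned dict maps a wire index to the bit it carries in that cycle.
    """
    lanes = [i for i, ok in enumerate(wire_ok) if ok]
    if not lanes:
        raise ValueError("link has no healthy wires")
    cycles = []
    for start in range(0, len(flit), len(lanes)):
        chunk = flit[start:start + len(lanes)]
        cycles.append(dict(zip(lanes, chunk)))
    return cycles


def receive_from_pfl(cycles: List[Dict[int, int]]) -> List[int]:
    """Reassemble the flit by reading wires in index order, cycle by cycle."""
    return [bit for cycle in cycles for _, bit in sorted(cycle.items())]


# Hypothetical 8-wire link in which wires 2 and 5 have failed.
wire_ok = [True, True, False, True, True, False, True, True]
flit = [1, 0, 1, 1, 0, 0, 1, 0]
cycles = send_over_pfl(flit, wire_ok)
```

With six of eight wires still working, the 8-bit flit in this sketch takes two cycles instead of one, so the link stays usable at reduced bandwidth rather than being removed from the network.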


Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication Date: Aug 1, 2012
Citations: 26

Similar Papers

Performance Analysis of OSPF Routing Protocol Under Single and Multiple Link Failure
Himanshi Saini ... Amit Kumar Garg
01 Jan 2018

A fine-grained link-level fault-tolerant mechanism for networks-on-chip
Arseniy Vitkovskiy ... Chrysostomos Nicopoulos
01 Oct 2010

Making IGP Routing Robust to Link Failures
Ashwin Sridharan ... Roch Guérin
01 Jan 2004

Network coding protection based on p-cycles for mesh networks
Lei Guo ... Xingwei Wang
01 Sep 2010
