Abstract
Networks-on-Chip (NoCs) are promising fabrics for scalable and efficient on-chip communication in large-scale many-core systems. In place of the well-studied synchronous NoCs, event-driven asynchronous ones have emerged as a promising replacement thanks to their strong timing robustness, especially when implemented in quasi-delay-insensitive (QDI) circuits. However, their fault tolerance has rarely been studied. QDI NoCs show complicated failure scenarios and behave differently from synchronous ones. As semiconductor technology scales, the aging process is expected to accelerate, so permanent faults become more likely to occur at runtime. These faults can break the handshake, leading to physical-layer deadlocks that can spread and paralyze the whole QDI NoC. Such physical-layer deadlocks cannot be resolved using conventional fault-tolerant or deadlock management techniques. This paper systematically studies the impact of permanent faults on QDI NoCs and presents novel deadlock detection and recovery techniques to handle fault-caused physical-layer deadlocks. The proposed detection technique has been implemented to protect the NoC data paths, which occupy ~90% of the logic. When the detection and recovery techniques are employed to protect interrouter links (~60% of the logic), a permanently faulty link is precisely located and the network function can be recovered with graceful performance degradation.
Highlights
Networks-on-chip (NoCs) are a promising infrastructure to support on-chip communication of large-scale multicore systems due to their efficiency and ... (Manuscript received November 13, 2016; revised March 4, 2017 and May 31, 2017; accepted July 5, 2017.)
Synchronous NoCs need to distribute the global clock with little skew over long distances, which may cross multiple timing domains belonging to different intellectual property (IP) cores.
In a pure QDI NoC studied in this paper, this partial data will propagate to all downstream stages as long as they are ready (Section III). If their technique is used, multiple deadlocks will therefore be reported along the deadlocked packet path, and the fault position cannot be located.
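The propagation effect described above can be illustrated with a toy Python model. This is a minimal sketch under stated assumptions, not the paper's circuit or its detection technique: a data word is modelled as a list of slices, a permanent fault is modelled as one slice that never becomes valid from a given stage onward, and every stage is assumed to carry a hypothetical local detector that reports when it holds a partial word. The names `propagate` and `naive_reports` are illustrative only.

```python
from typing import List, Optional

Word = List[Optional[int]]  # None marks a slice that never becomes valid


def propagate(word: Word, n_stages: int, fault_stage: int, stuck_slice: int) -> List[Word]:
    """Return the word observed at each pipeline stage.

    Assumption: from the faulty stage onward, one slice is lost, while the
    remaining slices keep moving through every downstream stage.
    """
    seen = []
    current = list(word)
    for stage in range(n_stages):
        if stage == fault_stage:
            current = list(current)
            current[stuck_slice] = None  # the faulty slice is never completed
        seen.append(list(current))
    return seen


def naive_reports(seen: List[Word]) -> List[int]:
    """Hypothetical per-stage detector: report when a stage holds a partial word."""
    return [
        stage
        for stage, word in enumerate(seen)
        if any(s is None for s in word) and any(s is not None for s in word)
    ]


# Six stages, fault injected at stage 2 on slice 1.
reports = naive_reports(propagate([1, 2, 3, 0], n_stages=6, fault_stage=2, stuck_slice=1))
print(reports)
```

In this model every stage from the fault onward observes the same symptom, so several reports are raised along the path and a single local report does not by itself identify the faulty stage.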
Summary
Networks-on-chip (NoCs) are a promising infrastructure to support on-chip communication of large-scale multicore systems due to their efficiency and ... Permanent faults can halt the handshaking process, resulting in physical-layer deadlocks. These deadlocks are different from network-layer ones caused by the cyclic dependence of packets [13]. (Running header: Zhang et al., "Handling Physical-Layer Deadlock Caused by Permanent Faults in QDI NoCs.") ... work without locating and isolating the faulty component. Handling runtime permanent faults on QDI NoCs in such a deadlock state is more difficult than on synchronous NoCs. In the deep-submicrometer era, when reliability has become one of the critical challenges for digital systems [14], it is important to keep specific, critical, or ultraexpensive systems working even with some performance loss, which creates a demand for permanent-fault-tolerant QDI NoCs. This paper handles physical-layer deadlocks caused by permanent faults on QDI NoCs. Its contribution includes the following. 4) For intermittent faults (an early symptom of permanent faults) that last long enough to cause a deadlock, the recovery mechanism automatically resumes the isolated pipeline stages once the fault disappears.
More From: IEEE Transactions on Very Large Scale Integration (VLSI) Systems