Abstract

As semiconductor processing techniques continue to scale down, transient faults, also known as soft errors, are becoming an increasing reliability threat to high-performance microprocessors fabricated using state-of-the-art CMOS technologies. Emerging 3D chip integration techniques leverage vertically stacked structures to reduce on-chip wire delay and have shown the capability of overcoming interconnect bottlenecks as well as reducing power consumption. While the benefits of 3D die stacking for microprocessor performance and power have been extensively investigated recently, its implications for transient fault susceptibility are largely unknown. In this work, we make the first attempt to characterize microarchitecture soft error vulnerabilities across the stacked chip layers under 3D integration technologies. Using models and simulations that capture the physical mechanisms of soft errors and their circuit- and architecture-level impact, our study reveals opportunities to leverage two characteristics of 3D integration (the vertically stacked structure and the incorporation of heterogeneous process technologies) to achieve enhanced reliability. We show that the first characteristic allows outer layers to shield inner layers from particle strikes, and that the second enables the deployment of error-resilient device techniques (e.g., Silicon-On-Insulator) on vulnerable layers to achieve a reliability target while minimizing manufacturing cost. We further propose a set of microarchitecture techniques that can effectively exploit the reliability benefits offered by 3D technologies. For example, we propose scheduling vulnerable in-flight instructions to reliable layers and designing robust register files by combining reliability-hardened circuits, program value vulnerability, and 3D integration techniques. Experimental results show that these techniques reduce the soft error rate of 3D microarchitectures by up to 88% compared to a planar design. We further evaluate the thermal implications of the proposed techniques and conclude that their impact on chip temperature is negligible.
