Abstract

This paper evaluates the reliability of a GPU pipeline subject to single event upsets (SEUs) and the impact of software-based and hardware-based fault tolerance techniques. First, the vulnerability of the GPU pipeline to SEUs is assessed through a fault injection campaign at the register transfer level. Second, three low-level software-based fault tolerance techniques are applied to protect the register files, with the aim of indirectly protecting the pipeline, and the resulting performance degradation is evaluated. Third, the cost of applying generic hardware-based redundancy techniques is evaluated in terms of area overhead. Experiments are performed on a GPU based on the NVIDIA G80 architecture running four case-study applications. Results show fault detection rates of 60% at a performance cost of 78% for the software-based techniques, and up to 100% with less than 60% area overhead for the hardware-based techniques.
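
To make the idea of an SEU fault injection campaign against a protected register concrete, the following is a minimal sketch in Python. It is not the paper's method: the abstract does not name its three software techniques, so duplication with comparison is used here as a generic stand-in, and the 32-bit register width, the function names `inject_seu` and `run_campaign`, and the uniform choice of fault site are all assumptions made for illustration.

```python
import random

WORD_BITS = 32  # assumed register width, not taken from the paper


def inject_seu(value, bit):
    """Flip a single bit of a 32-bit word, modelling one single event upset."""
    return (value ^ (1 << bit)) & 0xFFFFFFFF


def run_campaign(num_faults, seed=0):
    """Inject one SEU per trial into one of two redundant register copies.

    Returns the fraction of injected faults that a compare-before-use
    check (duplication with comparison) detects.
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(num_faults):
        original = rng.getrandbits(WORD_BITS)
        primary, shadow = original, original  # duplicated register contents
        bit = rng.randrange(WORD_BITS)        # hypothetical fault site
        if rng.random() < 0.5:
            primary = inject_seu(primary, bit)
        else:
            shadow = inject_seu(shadow, bit)
        if primary != shadow:                 # comparison before the value is used
            detected += 1
    return detected / num_faults
```

In this toy model every fault lands directly in a duplicated register, so the comparison always fires. It therefore says nothing about the detection rates reported in the abstract, which concern faults injected into the pipeline at register transfer level, where register file protection helps only indirectly.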
