Abstract

As semiconductor technologies scale down to deep sub-micron dimensions, transient faults will soon become a critical reliability concern. Due to their prohibitive costs, traditional high-end solutions are unacceptable for the mainstream commodity market. This paper presents FTPIPE, a hybrid software/hardware solution, which provides sufficient fault coverage with affordable overhead for single-threaded programs running on commodity systems. Leveraging existing exception mechanisms with minor modifications to handle exception-causing faults, FTPIPE focuses on tolerating silent data corruptions by using compile-time analysis and performing selective instruction replication in a modern superscalar pipeline extended with minimal hardware overhead. Unlike existing instruction replication-based solutions, which detect faults by synchronous checks, the FTPIPE platform has exploited a novel hybrid synchronous/asynchronous check method for the replicated instructions. In this manner, better performance can be obtained without degradation of fault coverage. By synchronous checks, the validation of the result of a replicated instruction must be finished before it is committed, whereas such a guarantee is not required by an asynchronous check. Evaluation using a set of nine programs from the Mibench benchmark suite demonstrates that FTPIPE can tolerate 89.8% of transient faults under a modest performance overhead of 20.1%.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.