Abstract

To detect transient control-flow and data faults caused by Single-Event Upsets (SEUs) in a microprocessor pipeline, several mechanisms that check execution at the retirement stage have been proposed and discussed over the years. In this paper, we propose a compression-based and compression-free checksum scheme that detects transient faults before commit and preserves binary compatibility. The scheme applies to time-redundant systems (virtual duplex and redundantly multithreaded) as well as structurally redundant systems. It can localize a fault by partial re-execution within the pipeline. With the addition of a modified micro-rollback, one or more pipeline stages can be rolled back and retried. In the best case, a fault can be detected, localized, and corrected within four clock cycles in a fine-grained redundantly threaded microprocessor. We validate and analyze the scheme through FPGA and standard-cell implementations and conclude that it can replace the well-known parity computation in high-performance designs.
