Abstract

This work presents a method for performing computations reliably in the presence of both hard faults arising from aggressive technology scaling and design defects arising from human error. The method is based on the observation that a single Turing-complete instruction can emulate the semantics of any other instruction. One such instruction is subleq (subtract and branch if less than or equal to zero), which has so far been used mainly for teaching. Its usefulness extends well beyond teaching, and the authors demonstrate its applicability to fault tolerance. In particular, they extend a MIPS processor with an ultra-reduced instruction set coprocessor (URISC) that implements the subleq instruction. The URISC executes sequences of subleq instructions that mimic the semantics of instructions that testing has identified as faulty on the MIPS core. An LLVM compiler back end generates the subleq sequence for each instruction marked as faulty. The result is a hardware-software approach to fault recovery. The authors experimentally evaluate the impact of single-upset faults on the instructions that are rendered faulty, the area overhead of the URISC, and the performance overhead of using the URISC.
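
To make the idea of emulating an instruction with subleq concrete, here is a minimal Python sketch. It is not the authors' implementation or the output of their LLVM back end. It assumes the common formulation of subleq, in which `subleq a, b, c` computes `mem[b] = mem[b] - mem[a]` and jumps to `c` if the result is less than or equal to zero. The halt convention (a negative program counter), the memory layout, and the names `run_subleq` and `emulate_add` are all assumptions made for illustration.

```python
def run_subleq(mem, pc=0, max_steps=10_000):
    """Interpret subleq code in `mem` until pc goes negative.

    Halting on a negative pc is an assumed convention for this sketch.
    """
    steps = 0
    while pc >= 0:
        if steps >= max_steps:
            raise RuntimeError("step limit exceeded")
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]                      # subtract
        pc = c if mem[b] <= 0 else pc + 3     # branch if result <= 0
        steps += 1
    return mem


def emulate_add(x, y):
    """Compute x + y using only subleq (hypothetical helper)."""
    # Three instructions occupy words 0..8; data words follow.
    X, Y, Z = 9, 10, 11
    program = [
        X, Z, 3,    # Z = 0 - x = -x; continue at word 3 either way
        Z, Y, 6,    # Y = y - (-x) = x + y; continue at word 6 either way
        Z, Z, -1,   # Z = 0, which is <= 0, so jump to -1 and halt
    ]
    mem = program + [x, y, 0]
    run_subleq(mem)
    return mem[Y]
```

In this sketch a single add costs three subleq instructions, which gives a rough feel for why the performance overhead of routing faulty instructions through a subleq-only coprocessor needs to be measured.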
