Abstract

A transient error is a device failure caused by transient hardware faults, which are induced by strikes from high-energy particles such as neutrons and alpha particles. In this study, the authors propose two fault-tolerant architecture schemes. The first, REMO, is a hardware-based solution that combines the best features of space and time redundancy. REMO provides very high fault coverage with minimal overheads in performance, power and area. The second, REMORA, combines the best features of hardware and software approaches to fault tolerance, and eliminates the persistent issue of unprotected code that exists in software approaches. Simulation results from the SPEC2006 benchmark suite indicate that REMO incurs an area increase of about 6%, a power overhead of 9% in spite of redundant execution, and a negligible performance penalty during a fault-free run. In REMORA, performance degradation increases to 12% and code size inflation is close to 12%, owing to the additional signature instructions inserted into the application program. The authors have also explored the possibility of eliminating this penalty by embedding the signatures in control flow instructions. The power and area overheads of REMORA are on par with those of REMO.
