Abstract
This paper examines a software-implemented self-checking technique capable of detecting transient hardware faults in processor registers. The proposed approach is intended to detect run-time transient bit errors in memory and in the processor status register. Error correction is not considered. This low-cost approach is intended for commodity systems built on ordinary off-the-shelf microprocessors, where detecting operational faults is a step toward a fail-safe kind of fault-tolerant system.
Highlights
The objective of the proposed software-based self-checking technique is to detect multiple transient bit errors in processor memory and the processor status word (PSW)
Algorithm-Based Fault Tolerance (ABFT) suits applications that use regular structures, so it applies only to a limited set of problems
Transient bit errors in the NOP codes that may have occurred before this code executes are detected by this low-cost software fix
Summary
The objective of the proposed software-based self-checking technique is to detect multiple transient bit errors (soft errors) in processor memory and the processor status word (PSW). When a disagreement between the PSWs is observed, the programmer can bring program control to a safe, known stable state (through error flags) and restart the application, gaining a fail-safe type of fault tolerance. Fault coverage is limited to random and multiple bit errors in the memory area containing NOP codes or PUSH PSW and POP PSW instruction codes, in the memory stack, or in the processor status register holding the PSW bit pattern, and to a limited set of program control errors. The effectiveness of the proposed transient error detection scheme is verified on a microprocessor-based system by debugging a source program that was manually modified with random single-bit flips. The proposed approach is useful for detecting the various kinds of faults in a faulty processor that generates wrong and random answers (i.e., Byzantine faults). The proposed low-cost software-fix scheme is intended to detect multiple run-time transient errors in part of the memory space and in processor registers.
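The PSW comparison described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the summary does not give the actual instruction sequence, so the `ToyCpu` class, the 8-bit PSW width, and the names `psw_self_check` and `flip_random_bit` are all hypothetical stand-ins, and Python is used in place of the target microprocessor's assembly.

```python
import random


def flip_random_bit(word, width=8, rng=random):
    """Return `word` with one randomly chosen bit inverted (models a transient fault)."""
    return word ^ (1 << rng.randrange(width))


class ToyCpu:
    """Hypothetical processor model holding only an 8-bit status word."""

    def __init__(self, psw):
        self.psw = psw & 0xFF

    def read_psw(self):
        # Stands in for saving the PSW (e.g. PUSH PSW, then reading the stack copy).
        return self.psw

    def run_nops(self):
        # Fault-free NOPs leave the status flags untouched.
        pass

    def run_nops_with_fault(self):
        # Assumption: a transient error flips one PSW bit while the NOPs execute.
        self.psw = flip_random_bit(self.psw)


def psw_self_check(read_psw, run_nops):
    """Save the PSW, run flag-neutral NOPs, save it again, and compare.

    Returns True when the two copies disagree, i.e. the error flag is raised.
    """
    first_copy = read_psw()
    run_nops()
    second_copy = read_psw()
    return first_copy != second_copy


if __name__ == "__main__":
    cpu = ToyCpu(0b01000100)
    if psw_self_check(cpu.read_psw, cpu.run_nops_with_fault):
        # Detection only: the application would move to a known safe state and restart.
        print("PSW mismatch detected: entering safe state")
```

Because a single bit flip always changes the status word, the check in this toy model always fires for the injected fault. A real implementation would also have to consider errors that strike both saved copies identically, which this sketch does not model.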