Abstract
This paper presents a behavioral-based error detection technique, control flow checking by execution tracing (CFCET), that increases the concurrent error detection capabilities of commercial off-the-shelf (COTS) processors. The technique traces the program jumps graph (PJG) at run-time and compares it with a reference jumps graph to detect violations caused by transient faults. The reference graph is derived from the source program by a preprocessor. CFCET uses an external watchdog processor (WDP) together with the internal execution tracing feature available in COTS processors to monitor the addresses of a program's taken branches externally. Because application programs are not modified, the program overhead is zero. The technique is evaluated analytically under three different fault models. The results show that the error detection coverage varies between 79.74% and 96.43%, depending on the workload program. Errors are detected with nearly zero latency. The external hardware overhead is about 3% of an Altera FLEX 10K30 FPGA, and the execution time overhead is between 33.26% and 140.81%, depending on the workload program. The overheads were measured experimentally by executing the workloads on a Pentium system.
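The comparison of traced branches against the reference jumps graph can be illustrated with a minimal software sketch. The abstract does not specify how the graph or the trace is encoded, so the representation here (a mapping from branch source address to its legal target addresses), the function name `first_violation`, and all addresses are assumptions made for illustration only; in CFCET the check is performed by the external watchdog processor, not in software.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Assumed representation: branch source address -> set of legal target addresses.
JumpsGraph = Dict[int, Set[int]]


def first_violation(reference: JumpsGraph,
                    trace: Iterable[Tuple[int, int]]) -> Optional[int]:
    """Return the index of the first taken branch that is not an edge of the
    reference graph, or None if every traced branch is legal."""
    for index, (source, target) in enumerate(trace):
        # An unknown source address has no legal targets at all.
        if target not in reference.get(source, set()):
            return index
    return None


# Illustrative addresses only; they are not taken from the paper.
REFERENCE: JumpsGraph = {
    0x1000: {0x1010, 0x1040},
    0x1040: {0x1000},
}

good_trace = [(0x1000, 0x1040), (0x1040, 0x1000), (0x1000, 0x1010)]
bad_trace = [(0x1000, 0x1040), (0x1040, 0x2000)]  # second jump is illegal
```

Calling `first_violation(REFERENCE, bad_trace)` flags the second traced branch, since the jump from `0x1040` to `0x2000` is not an edge of the assumed reference graph, whereas `good_trace` passes without a violation.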