The rapid development of processor manufacturing technologies has led most designers to emphasize processor resilience to errors. In embedded systems operating in harsh environments, transient faults such as single-event upsets (SEUs) or single-event multiple upsets (SEMUs), induced by radiation effects on a memory cell or a combinational logic circuit, can perturb the system's behavior, which in turn can have catastrophic consequences. In this context, Fault Injection (FI) has been successfully applied as a mature method to assess reliability against faults and to reveal the parts of a system that need to be protected cost-effectively. High-level software FI techniques are less accurate, whereas the high accuracy of low-level fault injection usually comes at the expense of desirable characteristics such as portability, flexibility, and low intrusiveness. Hence, this paper proposes a Configurable location-Aware Fault Injection (CAFI) technique. CAFI is a software-based technique that emulates different kinds of transient hardware faults, e.g., SEUs and SEMUs, at the software level. It therefore offers flexibility in conducting reliability assessment studies with varying fault models. It operates at the binary code level and exploits a timing-based mechanism to inject faults at run-time with negligible intrusiveness. CAFI can inject faults at different granularity levels to maximize fault activation: fine-grained injection targets a specific instruction field, while coarse-grained injection targets the whole system's software. CAFI requires negligible modifications to the target software under test and allows it to run at near-native speed. The effectiveness of CAFI is evaluated through a large number of fault injection experiments on different real-world benchmark programs. Moreover, the accuracy of CAFI is quantified with respect to high-level software fault injection.
To this end, a practical prototype of CAFI is implemented on the x86 architecture using an Intel Core i7 processor with 16 GB of RAM. Based on a total of 108,000 fault injection experiments, the rate of program crashes caused by CAFI is slightly more than 18% higher than that caused by high-level software fault injection. On the other hand, there are no significant differences between the silent data corruption (SDC) results obtained by CAFI and those obtained by high-level software fault injection, making CAFI applicable for studying both crash-causing and SDC-causing errors.
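
The SEU and SEMU fault models named in the abstract can be illustrated with a minimal sketch. It is not CAFI's injection mechanism, which works on binary code at run-time using a timing-based trigger. It only shows, under simplifying assumptions, what a single-bit and a multi-bit flip look like on a byte buffer standing in for an instruction encoding. The function names `inject_bit_flips` and `random_fault` and the sample bytes are hypothetical and chosen for illustration only.

```python
import random


def inject_bit_flips(data: bytes, offset: int, bit_positions) -> bytes:
    """Return a copy of `data` with the given bits of byte `offset` flipped.

    One bit position models an SEU; several positions model an SEMU.
    (Hypothetical helper, not part of CAFI.)
    """
    corrupted = bytearray(data)
    for bit in set(bit_positions):  # a repeated position would cancel itself
        if not 0 <= bit < 8:
            raise ValueError("bit position must be in the range 0..7")
        corrupted[offset] ^= 1 << bit
    return bytes(corrupted)


def random_fault(data: bytes, n_bits: int, rng: random.Random):
    """Pick a random byte and `n_bits` distinct bits in it, then flip them."""
    offset = rng.randrange(len(data))
    bits = rng.sample(range(8), n_bits)
    return offset, bits, inject_bit_flips(data, offset, bits)


# Illustrative bytes only; assumed here to stand in for machine code.
original = bytes([0x90, 0x48, 0x89, 0xE5])

seu = inject_bit_flips(original, offset=1, bit_positions=[0])      # one bit
semu = inject_bit_flips(original, offset=2, bit_positions=[0, 3])  # two bits

rng = random.Random(0)  # fixed seed so a campaign can be reproduced
offset, bits, faulty = random_fault(original, n_bits=2, rng=rng)
```

Restricting `offset` and `bit_positions` to the bytes of one instruction field would correspond loosely to the fine-grained mode described above, while sampling over the whole buffer corresponds loosely to the coarse-grained mode.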