Abstract

The exponentially increasing occurrence of soft errors makes the optimization of reliability, performance, hardware area, and power consumption one of the main concerns in modern embedded processors. Because hardware techniques for improving microprocessor reliability are too costly for resource-constrained embedded systems, software-level fault tolerance mechanisms have been proposed as an attractive solution to soft error threats. However, many software-level redundancy-based schemes incur considerable performance overhead, which is not acceptable for many embedded applications. In this work, we introduce an ultra-low-cost soft error protection scheme for embedded applications that analyzes the source code to identify critical variables. Once identified, these variables are protected by placing runtime checks at critical points of execution. Our experimental results on several applications demonstrate that the proposed scheme can reduce the failure rate by 47% with negligible performance degradation.
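The abstract does not say how the runtime checks are implemented, so the following is only a minimal sketch of the general idea, assuming a simple duplicate-and-compare check. The names `ProtectedVar`, `SoftErrorDetected`, `flip_bit`, and `accumulate` are hypothetical and are not taken from the paper. A variable assumed to be critical keeps a shadow copy, both copies are updated independently, and they are compared only at one critical point, just before the value leaves the function.

```python
class SoftErrorDetected(Exception):
    """Raised when the two copies of a protected variable disagree."""


class ProtectedVar:
    """Hypothetical critical variable stored as a primary and a shadow copy."""

    def __init__(self, value: int) -> None:
        self.primary = value
        self.shadow = value

    def add(self, delta: int) -> None:
        # Update each copy independently so that a corrupted primary
        # is never copied over the shadow.
        self.primary += delta
        self.shadow += delta

    def flip_bit(self, bit: int) -> None:
        # Fault injection for illustration: flip one bit of the primary only.
        self.primary ^= 1 << bit

    def checked_read(self) -> int:
        # Runtime check: compare the copies before the value is used.
        if self.primary != self.shadow:
            raise SoftErrorDetected("primary and shadow copies differ")
        return self.primary


def accumulate(samples, inject_at=None):
    """Sum the samples; optionally inject a bit flip after index inject_at."""
    total = ProtectedVar(0)  # assumed to be the critical variable
    for i, sample in enumerate(samples):
        total.add(sample)
        if i == inject_at:
            total.flip_bit(3)
    # Single check at the critical point: the function's output.
    return total.checked_read()
```

Calling `accumulate([1, 2, 3])` returns the sum, while `accumulate([1, 2, 3], inject_at=1)` raises `SoftErrorDetected`. Checking only at the point where the value is consumed, rather than after every update, is what keeps the overhead of a scheme like this low.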

Highlights

  • One of the primary sources of unreliability in modern processors is transient faults or soft errors

  • To the software level [2] in contemporary microprocessors, it has been projected that the number of transient faults that bypass the masking walls and change the final program output will increase by 30× as technology scales down from 45 nm to 16 nm [3,4]

  • In order to mitigate the problem of soft errors, researchers have proposed fault-tolerant mechanisms operating at several levels, including circuit level [5,6], microarchitectural level [2], and software level [7,8,9,10,11,12,13,14]


Summary

Introduction

One of the primary sources of unreliability in modern processors is transient faults, also known as soft errors. To mitigate the problem of soft errors, researchers have proposed fault-tolerant mechanisms operating at several levels, including circuit level [5,6], microarchitectural level [2], and software level [7,8,9,10,11,12,13,14]. Soft errors are among the main causes of hardware malfunctions and can jeopardize the functional safety of an application. Background radiation, such as high-energy neutrons and protons, is considered the primary source of soft errors. Soft errors were considered a reliability challenge for high-altitude

