Abstract

Shrinking feature sizes and operating voltages are making processor circuits increasingly vulnerable to soft faults, which calls for fault-resilience techniques at both the software and hardware levels in the big data context. To assist software developers in writing fault-resilient big data applications, we propose ErrorSight, a tool that helps them focus their efforts on the code regions and data structures most vulnerable to soft errors, understand how numerical errors propagate through the program, and apply fault-resilience techniques effectively. ErrorSight achieves this by efficiently generating error profiles, leveraging the predictive power of the Boosted Regression Tree model. We use four big data kernels to illustrate the modular analysis mechanism of ErrorSight and show its usefulness in developing numerical fault resilience for big data applications.

