Abstract
Technology scaling has raised serious reliability concerns: as transistor dimensions shrink, modern microprocessors become denser and more vulnerable to faults. Reliability studies have proposed a plethora of methodologies for assessing system vulnerability, but these rely heavily on traditional reliability metrics that express only the failure rate over time. Although Failures In Time (FIT) is a strong and representative reliability metric, it may fail to offer an objective comparison of highly diverse systems, such as CPUs against GPUs or other accelerators, which are often employed to execute the same algorithms implemented for each platform. In this paper, we propose a reliability evaluation methodology that compares heterogeneous systems by taking into account the probability of a workload execution failure, while also capturing the differences in the performance of these systems. We demonstrate the usefulness of the methodology with a case study that compares the reliability and performance of three different commercial CPUs (with different ISAs and microarchitectures) and one GPU. We use statistical fault injection to assess the vulnerability of the register file in each of the four computing systems. The evaluation uses a comprehensive set of benchmarks, with the same algorithms implemented for each individual system (serial code for the CPUs and parallel code for the GPU). Our findings show that, even though the GPU proves to be three orders of magnitude more vulnerable than the CPUs under traditional reliability metrics, our performance-aware evaluation methodology shrinks this gap by 1-2 orders of magnitude, providing more informative and realistic measurements to guide designers' and programmers' decisions.
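To illustrate why a rate-based metric and a per-execution metric can rank systems differently, the following is a minimal sketch and not the methodology of the paper. It assumes a constant failure rate (exponential model), under which the probability of at least one failure during a run of duration t is 1 - exp(-λt), with λ derived from FIT (failures per 10^9 device-hours). The function name and all FIT values and execution times are hypothetical, chosen only to show the effect.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # FIT counts failures per 10^9 device-hours


def failure_prob_per_run(fit: float, exec_seconds: float) -> float:
    """Probability of at least one failure during a single workload run.

    Assumes a constant failure rate (exponential model); this is an
    illustrative assumption, not the metric defined in the paper.
    """
    rate_per_hour = fit / HOURS_PER_FIT_UNIT
    hours = exec_seconds / 3600.0
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-rate_per_hour * hours)


# Hypothetical inputs: the GPU has a 1000x higher FIT but finishes 100x faster.
cpu_fit, cpu_seconds = 10.0, 100.0
gpu_fit, gpu_seconds = 10_000.0, 1.0

cpu = failure_prob_per_run(cpu_fit, cpu_seconds)
gpu = failure_prob_per_run(gpu_fit, gpu_seconds)

fit_gap = gpu_fit / cpu_fit  # gap seen by the rate-based metric
run_gap = gpu / cpu          # gap seen per workload execution

print(f"FIT gap:     {math.log10(fit_gap):.2f} orders of magnitude")
print(f"Per-run gap: {math.log10(run_gap):.2f} orders of magnitude")
```

With these made-up inputs, the shorter execution time of the faster system offsets part of its higher failure rate, so the per-run gap is smaller than the FIT gap. The actual size of this effect depends on the measured FIT values and execution times of the systems being compared.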