Abstract

Convolutional Neural Networks (CNNs) are used extensively in safety-critical applications. GPUs are currently the dominant DNN accelerators, used to speed up the execution of CNN models and reduce their latency. However, GPUs are prone to soft errors, which can dramatically alter the behavior of the GPU. The resulting faults may corrupt data values or logic operations and cause errors such as Silent Data Corruption (SDC). Unfortunately, soft errors propagate from the physical level (the GPU) to the application level (the CNN model). This paper analyzes the reliability of the AlexNet model to identify which parts of the model are most vulnerable to soft errors. To this end, we injected faults into AlexNet running on an NVIDIA GPU, using the SASSIFI fault injector as the main evaluation tool. The experiments show a large reduction in SDC rates. In the Im2col kernel, SDCs fell from 9.3% to 0.00% for STORE instructions and from 5.0% to 0.00% for GPR instructions. In the Add_bias kernel, SDCs fell from 0.8% to 0.00% for STORE instructions and from 1.2% to 0.1% for GPR instructions.
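To make the notion of a soft error concrete, the following is a minimal sketch of a single bit flip in a 32-bit floating-point value and its silent effect on a small dot product. It is an illustration only, not the SASSIFI tool or the paper's injection procedure; the helper names `flip_bit` and `dot`, and the weights and inputs, are hypothetical.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return value (stored as IEEE 754 float32) with one encoded bit flipped."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 as an unsigned 32-bit integer.
    (encoded,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR toggles exactly one bit, which models a single-bit upset.
    (corrupted,) = struct.unpack("<f", struct.pack("<I", encoded ^ (1 << bit)))
    return corrupted


def dot(weights, inputs):
    """Plain dot product, standing in for one neuron's accumulation."""
    return sum(w * x for w, x in zip(weights, inputs))


# Hypothetical values chosen so every result is exactly representable.
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 2.0, 4.0]

golden = dot(weights, inputs)  # fault-free reference result

# Flip the sign bit (bit 31) of the last weight: 2.0 becomes -2.0.
faulty_weights = weights[:2] + [flip_bit(weights[2], 31)]
faulty = dot(faulty_weights, inputs)

# The run finishes without a crash or exception, yet the output differs
# from the golden run. That combination is what SDC refers to.
is_sdc = faulty != golden
print(golden, faulty, is_sdc)
```

Comparing each faulty run against a golden (fault-free) run in this way is the usual basis for counting an outcome as an SDC rather than as a crash or a masked fault.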
