Abstract

Deep learning techniques have been successfully used to solve a wide range of computer vision problems. Due to their high computational complexity, specialized hardware accelerators are being proposed to achieve high performance and efficiency for deep learning-based algorithms. However, these hardware systems are prone to soft errors, i.e., bit-flip errors in layer outputs, caused by process variation and high-energy particles, and such errors can significantly reduce model accuracy. To remedy this problem, we propose new algorithms that effectively reduce the impact of errors and thus maintain high accuracy. We first propose to incorporate an Error Correction Layer (ECL) into neural networks, where the convolution in each layer is performed multiple times and a bit-level majority vote is taken over the outputs. We find that the ECL eliminates most errors, but a bit error still passes through when the bit at the same position is corrupted in multiple copies under the simulated conditions. To address this problem, we analyze the impact of errors with respect to bit position and observe that errors in the most significant bit (MSB) positions tend to corrupt the output of the network far more severely than errors in the least significant bit (LSB) positions. Based on this observation, we propose a new specialized activation function, called the Piece-wise Rectified Linear Unit (PwReLU), which selectively suppresses errors depending on the bit position, increasing the model's resistance to errors. Compared to existing activation functions, the proposed PwReLU achieves accuracy margins of up to 20% even at very high bit error rates (BERs). Our extensive experiments show that the proposed ECL and PwReLU work in a complementary manner, achieving accuracy comparable to the error-free networks even at a severe BER of 0.1% on CIFAR10, CIFAR100, and ImageNet.
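To make the bit-level majority vote concrete, the sketch below shows one way such a correction step could be implemented for quantized layer outputs. It is a minimal illustration assuming 8-bit unsigned activations and three repeated evaluations of the same convolution; the function name bitwise_majority_vote and all parameters are illustrative, not the paper's implementation.

```python
import numpy as np

def bitwise_majority_vote(outputs, num_bits=8):
    """Bit-level majority vote over repeated (quantized) layer outputs.

    Minimal sketch of the ECL idea: the same convolution is evaluated
    several times and, for every bit position, the value that appears
    in the majority of copies is kept. `outputs` is a list of
    identically shaped uint8 arrays (illustrative 8-bit quantization).
    """
    stacked = np.stack([o.astype(np.uint8) for o in outputs])  # (copies, ...)
    corrected = np.zeros_like(stacked[0])
    for b in range(num_bits):
        # Extract bit b from every copy and take the per-element majority.
        bits = (stacked >> b) & 1
        majority = (bits.sum(axis=0) * 2 > stacked.shape[0]).astype(np.uint8)
        corrected |= majority << b
    return corrected

# Usage: three copies of the same layer output, one with a single MSB flip.
clean = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
copies = [clean.copy() for _ in range(3)]
copies[0][0, 0] ^= 0b1000_0000          # corrupt the MSB in one copy only
print(np.array_equal(bitwise_majority_vote(copies), clean))  # True
```

As noted in the abstract, this scheme fails when the bit at the same position is flipped in a majority of the copies, which is what motivates PwReLU.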

Highlights

  • Recent deep neural networks (DNNs) have shown promising performance in various areas, but their huge computational complexity and large power consumption have impeded the deployment of practical and/or real-time DNN applications

  • We present n variants of the Piece-wise Rectified Linear Unit (PwReLU), denoted PwReLUn according to the number of thresholds n (see the sketch after this list)

  • Recent specialized hardware systems are highly parallel architectures that are ideal for implementing DNNs, but soft errors arising from process variation and their complex circuitry can render them useless
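As a rough illustration of the PwReLUn idea referenced in the highlights above, the sketch below implements a piece-wise activation whose slope shrinks after each threshold, so implausibly large activations (the typical symptom of an MSB flip) are progressively suppressed while small activations pass through as in an ordinary ReLU. The thresholds and slopes are hypothetical; the paper's exact parameterization is not reproduced here.

```python
import numpy as np

def pwrelu_n(x, thresholds=(1.0, 2.0, 4.0), slopes=(1.0, 0.5, 0.25, 0.0)):
    """Illustrative piece-wise ReLU with n thresholds (here n = 3).

    Hypothetical parameterization, not the paper's: below the first
    threshold the unit behaves like an ordinary ReLU; each successive
    piece uses a smaller slope, so very large (likely corrupted)
    activations are progressively suppressed.
    """
    x = np.asarray(x, dtype=np.float32)
    y = np.maximum(x, 0.0) * slopes[0]
    for i, t in enumerate(thresholds):
        # Change the slope for the portion of x that exceeds threshold t.
        y += (slopes[i + 1] - slopes[i]) * np.maximum(x - t, 0.0)
    return y

# A moderate activation is barely changed; a huge one (e.g., from an MSB
# flip) is clamped by the final zero-slope piece.
print(pwrelu_n(np.array([0.5, 3.0, 100.0])))  # ~[0.5, 1.75, 2.0]
```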


Summary

Introduction

Recent deep neural networks (DNNs) have shown promising performance in various areas, but their huge computational complexity and large power consumption have impeded the deployment of practical and/or real-time DNN applications. In the pursuit of making DNNs more efficient, rigorous efforts have been made from four different perspectives, i.e., the element, circuit, process (hardware implementation), and algorithm perspectives.

A. PREVIOUS WORK

1) BUILDING EFFICIENT DNNs AT FOUR LEVELS

At the algorithm level, Han et al. [18] proposed Deep Compression, which combines pruning, quantization, and Huffman coding of the weight values of a DNN to reduce computational complexity and memory accesses. Chollet et al. [8] proposed a depth-wise separable convolution layer that decomposes 3-dimensional convolution into a spatial (depth-wise) convolution and a channel (point-wise) convolution.
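For readers unfamiliar with that decomposition, the sketch below shows a depth-wise separable convolution layer, assuming PyTorch; the class name and hyperparameters are illustrative and not taken from [8].

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Sketch of the decomposition described in [8]: a spatial (depth-wise)
    convolution applied per channel, followed by a 1x1 (point-wise)
    convolution that mixes channels."""

    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

# A standard 3x3 convolution needs in_ch * out_ch * 9 weights; this pair
# needs in_ch * 9 + in_ch * out_ch, far fewer for typical channel counts.
layer = DepthwiseSeparableConv(64, 128)
print(layer(torch.randn(1, 64, 32, 32)).shape)  # torch.Size([1, 128, 32, 32])
```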
