Abstract

The training process of a deep neural network commonly consists of three phases: forward propagation, backward propagation, and weight update. In this paper, we propose a hardware architecture to accelerate backward propagation. Our approach applies to neural networks that use the rectified linear unit (ReLU) activation function. Because backward propagation produces a zero activation gradient wherever the corresponding forward activation is zero, the gradient calculation for those elements can be safely skipped. Based on this observation, we design an efficient hardware accelerator for training deep neural networks by selectively computing gradients. We show the effectiveness of our approach through experiments with various network models.

CCS CONCEPTS
• Computer systems organization $\rightarrow$ Neural networks;
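The core observation can be illustrated with a small NumPy sketch of a dense layer's backward pass. The function name, shapes, and dense-layer setting are illustrative assumptions for exposition, not the paper's accelerator datapath: only the neurons whose forward ReLU activation is nonzero contribute to the gradient propagated to the previous layer, so the remaining rows of the computation can be skipped.

```python
import numpy as np

def relu_backward_selective(grad_output, activation, weight):
    """Backward pass through a ReLU + fully connected layer that skips
    gradient computation for neurons whose forward activation is zero.

    grad_output : gradient w.r.t. this layer's post-ReLU activations, shape (out_dim,)
    activation  : forward-pass ReLU outputs of this layer, shape (out_dim,)
    weight      : weight matrix used in the forward pass, shape (out_dim, in_dim)
    """
    # ReLU gradient: zero wherever the forward activation was zero.
    active = activation > 0                       # mask of "live" neurons
    grad_pre = np.where(active, grad_output, 0.0) # dL/dz (pre-activation gradient)

    # Selective computation: only rows of the weight matrix belonging to
    # active neurons contribute to the gradient sent to the previous layer.
    grad_input = grad_pre[active] @ weight[active, :]
    return grad_input

# Example: 4 output neurons, 3 inputs; two activations are zero,
# so only two rows of the weight matrix are used in the backward pass.
a = np.array([0.0, 1.2, 0.0, 0.7])
g = np.array([0.3, -0.5, 0.1, 0.9])
W = np.random.randn(4, 3)
print(relu_backward_selective(g, a, W))
```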
