Bactran: A Hardware Batch Normalization Implementation for CNN Training Engine

Yang Zhijie, Guo Shasha, Wang Shuquan, Luo Li, Wang Lei, Li Shiming

doi:10.1109/les.2020.2975055

Abstract

In recent years, convolutional neural networks (CNNs) have been widely used. However, their ever-increasing number of parameters makes training them on GPUs costly in both time and energy. This has prompted researchers to turn their attention to training on more energy-efficient hardware. The batch normalization (BN) layer is widely used in state-of-the-art CNNs because it is indispensable for accelerating CNN training. As the amount of computation in the convolutional layers declines, the relative importance of the BN layer continues to increase. However, traditional CNN training accelerators do not pay attention to an efficient hardware implementation of the BN layer. In this letter, we design an efficient CNN training architecture based on a systolic array. The processing elements of the systolic array support the BN functions in both training and inference. The implemented BN function is an improved, hardware-friendly BN algorithm, range batch normalization (RBN). Experimental results show that the RBN implementation saves 10% of hardware resources, reduces power by 10.1%, and reduces delay by 4.6% on average. We implement the accelerator on a VU440 field-programmable gate array (FPGA), and the power consumption of its core computing engine is 8.9 W.
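The abstract does not spell out the RBN formula. As a rough illustration only, the sketch below follows the way range BN is commonly formulated in the literature on low-precision training: the standard deviation is replaced by the range (max minus min) of the centered activations, scaled by C(n) = 1/sqrt(2 ln n) for a batch of n values. The function name `range_batch_norm`, the parameters `gamma`, `beta`, and `eps`, and the floating-point arithmetic are assumptions made for this example; they are not taken from the letter and do not describe the authors' hardware design.

```python
import math


def range_batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations using the range instead of the variance.

    Hypothetical software sketch of range BN; not the authors' implementation.
    """
    n = len(batch)
    if n < 2:
        raise ValueError("range BN needs at least two samples")

    # Center the activations, as in standard batch normalization.
    mean = sum(batch) / n
    centered = [x - mean for x in batch]

    # Replace the standard deviation with a scaled range. This avoids the
    # sum of squares and the square root over the data, which is why the
    # variant is considered hardware-friendly.
    value_range = max(centered) - min(centered)
    scale = 1.0 / math.sqrt(2.0 * math.log(n))  # C(n), depends only on batch size
    denom = scale * value_range + eps           # eps guards against a zero range

    # Apply the learnable scale (gamma) and shift (beta).
    return [gamma * c / denom + beta for c in centered]


normalized = range_batch_norm([1.0, 2.0, 3.0, 4.0])
```

Because C(n) depends only on the batch size, it can be precomputed, so the per-batch work reduces to a mean, a max, a min, and one division per element.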

Journal: IEEE Embedded Systems Letters	Publication Date: Feb 28, 2020
Citations: 27
