Abstract

Parallel learning in neural networks can greatly shorten the training time. Prior efforts were mostly limited to distributing inputs across multiple computing engines, because the gradient descent algorithm used in neural network training is inherently sequential. This paper proposes a novel parallel training method for CNN-based image recognition. It overcomes the sequential nature of gradient descent and enables parallel training through speculative backpropagation. We found that the Softmax and ReLU outcomes of the forward propagation for inputs with the same label are likely to be very similar. This characteristic makes it possible to perform the forward and backward propagation simultaneously. We implemented the proposed parallel model with CNNs in both software and hardware and evaluated its performance. The parallel training reduces the training time by 34% on CIFAR-100 with no loss of prediction accuracy compared to sequential training; in many cases, it even improves the accuracy.
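
As a rough illustration of the speculation condition described above, the sketch below (not the authors' code; the SpeculationCache class, its methods, and the toy data are hypothetical) caches the most recent softmax output seen for each label and compares it with a freshly computed one. When the two are close, a backward pass started from the cached output would have produced nearly the same gradient, which is what allows the backward pass to begin before the forward pass finishes.

  import numpy as np

  def softmax(z):
      e = np.exp(z - z.max())
      return e / e.sum()

  class SpeculationCache:
      """Hypothetical helper: stores the latest softmax output per class label."""
      def __init__(self):
          self.cache = {}                       # label -> cached softmax vector
      def predict(self, label):
          return self.cache.get(label)          # speculative activation, or None
      def update(self, label, softmax_out):
          self.cache[label] = softmax_out.copy()

  # Toy check: inputs with the same label tend to yield similar logits, so the
  # cached softmax is usually close to the real one and speculation succeeds.
  cache = SpeculationCache()
  label = 3
  z_prev = np.random.randn(10)
  cache.update(label, softmax(z_prev))
  z_now = z_prev + 0.01 * np.random.randn(10)   # a later sample of the same class
  spec, real = cache.predict(label), softmax(z_now)
  print(np.abs(spec - real).max())              # small value => speculation usable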

Highlights

  • Artificial neural networks (ANNs) have successfully been applied in various applications such as text recognition [1], image classification [2], and speech recognition [3]

  • As a deep neural network (DNN) model grows in size, training requires a large number of vector-matrix multiplication (VMM) operations

  • We propose a novel idea of breaking the sequential property of the gradient descent algorithm for convolutional neural network (CNN) parallel training


Summary

INTRODUCTION

Artificial neural networks (ANNs) have successfully been applied in various applications such as text recognition [1], image classification [2], and speech recognition [3]. As a DNN model grows in size, training requires a large number of vector-matrix multiplication (VMM) operations, and the computational complexity increases proportionally with the number of layers and parameters. This means that training a DNN takes a huge amount of time. Because the gradient descent algorithm is inherently sequential, the forward propagation must complete before the backpropagation can begin. We propose a novel idea of breaking this sequential property of the gradient descent algorithm for CNN parallel training, enabling the forward and backward propagations to be performed in parallel. The hardware accelerator exhibits superior performance per watt because it requires only 1.2% more memory.
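
A minimal sketch of how the forward and backward propagations could overlap, assuming a single softmax layer and a two-worker thread pool. This is an illustrative reconstruction, not the paper's software or hardware implementation; the function names, the tolerance, and the fallback policy are assumptions.

  import concurrent.futures
  import numpy as np

  def forward(W, x):
      z = W @ x                                 # vector-matrix multiplication
      e = np.exp(z - z.max())
      return e / e.sum()                        # softmax output

  def backward(softmax_out, x, label, num_classes):
      # Cross-entropy gradient w.r.t. W for a (real or speculative) softmax output.
      one_hot = np.eye(num_classes)[label]
      return np.outer(softmax_out - one_hot, x)

  def speculative_step(W, x, label, cached_softmax, lr=0.01, tol=1e-2):
      num_classes = W.shape[0]
      with concurrent.futures.ThreadPoolExecutor(max_workers=2) as pool:
          fwd = pool.submit(forward, W, x)                        # real forward pass
          bwd = pool.submit(backward, cached_softmax, x,          # speculative backward
                            label, num_classes)
          real_softmax, grad = fwd.result(), bwd.result()
      if np.abs(real_softmax - cached_softmax).max() > tol:
          # Speculation failed: redo the backward pass with the real activations.
          grad = backward(real_softmax, x, label, num_classes)
      return W - lr * grad, real_softmax

  # Toy usage with random data; the cached output stands in for an earlier
  # forward pass over a sample of the same label.
  W, x = np.random.randn(10, 32) * 0.01, np.random.randn(32)
  W, out = speculative_step(W, x, label=3, cached_softmax=forward(W, x))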

RELATED WORK
THE NEURAL NETWORK TRAINING
SPECULATIVE BACKPROPAGATION
IMPLEMENTATION AND OPTIMIZATION OF HW PARALLEL TRAINING
EVALUATION
DISCUSSION
CONCLUSION