Abstract
Due to the excessive number of training parameters and the heavy computation of Deep Neural Network (DNN) models, training time keeps growing as the scale of DNN models continues to increase. Convolution is the key step of feature extraction in DNN models and accounts for about 90% of the computation operations in these models. It is therefore of great necessity to accelerate convolution calculation in order to improve the training efficiency of the system. Currently, the conventional approach is to transform convolution into matrix multiplication and execute it on many-core architectures such as the Graphics Processing Unit (GPU). However, due to the matrix conversion and the high computational complexity of matrix multiplication, methods based on matrix multiplication, such as Caffe, spend a great deal of time accessing memory and must copy redundant data during matrix conversion. The limited capacity of GPU memory further lowers the training efficiency of DNN models. Therefore, we orchestrate a dynamic look-up table method in place of matrix multiplication to realize convolution calculation and thereby optimize DNN training on many-core architectures. We further improve the parallelism of the look-up table method at a finer granularity by parallelizing the building of the convolution table and the table look-up operation on the GPU. In our experiments, we trained and tested on the MNIST, CIFAR-10, CIFAR-100, and ImageNet datasets. Experiments show that, compared to the original Caffe, the proposed dynamic look-up table method, referred to as LTP-Caffe, achieves up to a 31% speed-up in the training of DNN models. Experiments further show that LTP-Caffe matches Caffe in accuracy while iterating faster.
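The abstract describes the table-lookup idea only at a high level. As a rough illustration of the general technique, not the authors' LTP-Caffe implementation, the following NumPy sketch builds a convolution index table once per layer shape and then computes convolution by table lookup rather than by materializing an im2col matrix on every call. The function names (build_conv_table, conv2d_lookup) and the single-channel, stride-1 setting are our own assumptions for the sketch.

```python
import numpy as np

def build_conv_table(H, W, KH, KW, stride=1):
    """Precompute, once per layer shape, the flat input index read by
    each (output position, kernel element) pair."""
    OH = (H - KH) // stride + 1
    OW = (W - KW) // stride + 1
    table = np.empty((OH * OW, KH * KW), dtype=np.int64)
    for oy in range(OH):
        for ox in range(OW):
            row = oy * OW + ox
            for ky in range(KH):
                for kx in range(KW):
                    table[row, ky * KW + kx] = (oy * stride + ky) * W + (ox * stride + kx)
    return table, OH, OW

def conv2d_lookup(image, kernel, table, OH, OW):
    """Convolve by gathering input values through the precomputed table
    and reducing against the kernel, instead of building an explicit
    im2col matrix for each forward pass."""
    patches = image.ravel()[table]                  # (OH*OW, KH*KW) gather via fancy indexing
    return (patches @ kernel.ravel()).reshape(OH, OW)

# Hypothetical usage: a 5x5 image convolved with a 3x3 kernel.
img = np.arange(25, dtype=np.float32).reshape(5, 5)
ker = np.ones((3, 3), dtype=np.float32)
tbl, OH, OW = build_conv_table(5, 5, 3, 3)
print(conv2d_lookup(img, ker, tbl, OH, OW))         # matches a direct 3x3 convolution
```

In this sketch the index table is built once and reused across iterations; the gather and reduction steps are both data-parallel, which is the property the abstract exploits when parallelizing table building and table lookup on the GPU.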