Abstract

Training convolutional neural networks requires a large number of operations and memory accesses, which easily leads to a “memory wall” bottleneck that reduces computational performance and efficiency. Batch Normalization (BN) effectively speeds up the convergence of deep network training, but its complex data dependencies make the “memory wall” bottleneck more severe. To address the “memory wall” problem that arises when training convolutional neural networks with BN, we propose a training method that splits the BN layer and fuses computation across multiple layers to reduce memory access during model training. First, by reordering the “CONV+BN+RELU” (CBR) block, we trade extra computation for fewer memory accesses, reducing the amount of data accessed during training. Second, based on the memory access characteristics of the BN layer, we divide it into two sub-layers, each fused with its adjacent layers, and recombine the CBR block into “BN_B+RELU+CONV+BN_A” (BRCB). This further reduces reads and writes to main memory during training and alleviates the “memory wall” bottleneck, improving the computational efficiency of the accelerator. Experimental results show that, when training ResNet-50, Inception V3 and DenseNet models on an NVIDIA Tesla V100 GPU, the BRCB multi-layer fusion optimization reduces the amount of data accessed by 33%, 22% and 31% respectively compared with the original training method, and improves the actual computing efficiency of the V100 by 19%, 18% and 21% respectively.
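
The idea of splitting BN can be illustrated with a minimal forward-pass sketch. It rests on an assumption that the abstract does not spell out: that BN_A is the statistics sub-layer (fused with the preceding convolution, so the mean and variance are accumulated while each output is produced) and that BN_B is the normalization sub-layer (fused with the following ReLU, so the convolution output is read only once). The function names, the 1-D convolution, and the single-channel layout are all hypothetical and chosen only to keep the example small. Real implementations would be fused GPU kernels and would also cover the backward pass.

```python
import math


def conv1d(signal, kernel):
    """Plain 1-D 'valid' convolution (cross-correlation), used as a reference."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def bn_reference(x, gamma, beta, eps=1e-5):
    """Unsplit batch norm on one channel: one pass for statistics, one to normalize."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [gamma * (v - mean) * inv_std + beta for v in x]


def conv1d_with_bn_a(signal, kernel):
    """Hypothetical 'CONV+BN_A': accumulate sum and sum of squares while each
    convolution output is still at hand, so no separate statistics pass is needed."""
    k = len(kernel)
    out, total, total_sq = [], 0.0, 0.0
    for i in range(len(signal) - k + 1):
        y = sum(signal[i + j] * kernel[j] for j in range(k))
        out.append(y)
        total += y
        total_sq += y * y
    n = len(out)
    mean = total / n
    # Var[y] = E[y^2] - E[y]^2; clamp to guard against rounding below zero.
    var = max(total_sq / n - mean * mean, 0.0)
    return out, mean, var


def bn_b_relu(y, mean, var, gamma, beta, eps=1e-5):
    """Hypothetical 'BN_B+RELU': normalize and apply ReLU in a single read of y."""
    inv_std = 1.0 / math.sqrt(var + eps)
    return [max(0.0, gamma * (v - mean) * inv_std + beta) for v in y]


if __name__ == "__main__":
    signal = [0.5, -1.0, 2.0, 3.5, -0.25, 1.0]
    kernel = [1.0, -0.5, 0.25]
    y, mean, var = conv1d_with_bn_a(signal, kernel)
    print(bn_b_relu(y, mean, var, gamma=1.5, beta=0.1))
```

In this sketch the split version produces the same activations as `conv1d`, then `bn_reference`, then ReLU (up to floating-point rounding), while touching the convolution output fewer times. That reduction in passes over the data is the kind of saving the fusion aims for.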
