Abstract

Data parallelism and model parallelism are regarded as the two major parallelization strategies for deep neural networks (DNNs). However, both methodologies achieve acceleration mainly through coarse-grained, network-model-level parallelization, and neither fully exploits the parallelism available in network models and many-core systems such as GPUs. In this work, we propose FiLayer, a novel fine-grained parallelism strategy based on layer-wise parallelization, which comprises inter-layer parallelism and intra-layer parallelism. The former processes several adjacent layers of a network model in a pipelined manner; the latter divides the operations within one layer into several parts and processes them in parallel. CUDA streams are applied to realize both forms of fine-grained parallelism. FiLayer is implemented by extending Caffe and evaluated on several typical datasets. The experimental results indicate that FiLayer helps Caffe achieve speedups of \(1.58\times\)–\(2.19\times\).
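To make the intra-layer idea concrete, the following is a minimal CUDA sketch, not FiLayer's actual implementation: one layer's element-wise work (ReLU here) is split into chunks, each launched on its own CUDA stream so independent chunks may execute concurrently on the device. The names NUM_STREAMS, relu_chunk, and relu_layer_multistream are illustrative assumptions, not identifiers from the paper or from Caffe.

```cuda
// Hypothetical sketch of intra-layer parallelism via CUDA streams.
// A layer's element-wise computation is partitioned into NUM_STREAMS
// chunks; each chunk is launched on a separate stream.
#include <cuda_runtime.h>

#define NUM_STREAMS 4  // illustrative chunk count, not from the paper

__global__ void relu_chunk(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i] > 0.0f ? in[i] : 0.0f;
}

// d_in/d_out are device pointers to the layer's input/output of size n.
void relu_layer_multistream(const float* d_in, float* d_out, int n) {
    cudaStream_t streams[NUM_STREAMS];
    for (int s = 0; s < NUM_STREAMS; ++s)
        cudaStreamCreate(&streams[s]);

    int chunk = (n + NUM_STREAMS - 1) / NUM_STREAMS;
    for (int s = 0; s < NUM_STREAMS; ++s) {
        int offset = s * chunk;
        if (offset >= n) break;
        int len = (offset + chunk > n) ? (n - offset) : chunk;
        int threads = 256;
        int blocks = (len + threads - 1) / threads;
        // Launch this chunk on its own stream; chunks are independent,
        // so the GPU scheduler is free to overlap their execution.
        relu_chunk<<<blocks, threads, 0, streams[s]>>>(
            d_in + offset, d_out + offset, len);
    }

    // Wait for all chunks before the layer's output is consumed.
    for (int s = 0; s < NUM_STREAMS; ++s) {
        cudaStreamSynchronize(streams[s]);
        cudaStreamDestroy(streams[s]);
    }
}
```

Inter-layer parallelism follows the same stream mechanism at a coarser granularity: adjacent layers are bound to different streams so that, once a mini-batch fragment leaves layer \(i\), layer \(i{+}1\) can start on it while layer \(i\) processes the next fragment, forming a pipeline.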
