Abstract
The overwhelming parameter and computation demands of deep neural networks (DNNs) limit their applicability on a single computing node with weak computing power, such as edge and mobile devices. Most previous works leverage model pruning and compression strategies to reduce DNN parameters for resource-constrained devices; however, most model compression methods suffer from accuracy loss. Recently, we have found that combining many weak computing nodes into a distributed system to run large and sophisticated DNN models is a promising solution to this issue. However, such a distributed system requires purpose-built distributed DNN models and inference schemes: one of its great challenges is how to design an efficient distributed DNN model for data parallelism and model parallelism, and communication overhead is another critical performance bottleneck for distributed DNN models. Therefore, in this article, we propose the DFSNet framework (Dividing-Fuse neural Network with Searching Strategy) for distributed DNN architectures. Firstly, the DFSNet framework includes a joint "dividing-fusing" method that converts regular DNN models into distributed models that are friendly to distributed systems. This method divides the conventional DNN model along the channel dimension and inserts a few special layers that fuse feature-map information from different channel groups to improve accuracy. Since the fusion layers are sparse in the network, they add little extra inference time and communication overhead on the distributed nodes, yet they significantly help maintain the accuracy of the distributed neural network. Secondly, considering the architecture of the distributed computing nodes, we propose a parallel fusion topology that improves the utilization of the different computing nodes. Lastly, the popular weight-sharing neural architecture search (NAS) technique is leveraged to search for the positions of the fusion layers in the distributed DNN model for high accuracy and finally generate an efficient distributed DNN model. Compared with the original network, our converted distributed DNN achieves better performance (e.g., a 1.88% improvement for ResNet56 on CIFAR-100 and a 1.25% improvement for MobileNetV2 on ImageNet). In addition, most layers of the DNN are divided across the distributed nodes along the channel dimension, which makes the model particularly suitable for distributed DNN architectures with very low communication overhead.
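The core "dividing-fusing" idea, splitting most layers along the channel dimension so that each channel group can run on a separate node, with a few sparse fusion layers mixing information between groups, can be illustrated by the minimal PyTorch sketch below. This is a hypothetical sketch, not the paper's implementation: the group count, layer sizes, and fusion positions are illustrative assumptions, whereas the paper selects the fusion positions via weight-sharing NAS.

```python
# Hypothetical sketch of the "dividing-fusing" idea (not the authors' code).
# A grouped convolution processes channel groups independently, so each group
# could be placed on a different computing node; a sparse 1x1 fusion layer is
# the only point where feature maps from different groups are exchanged.
import torch
import torch.nn as nn

class DividedBlock(nn.Module):
    """Conv block divided along the channel dimension into independent groups."""
    def __init__(self, in_channels, out_channels, groups=4):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              padding=1, groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

class FusionLayer(nn.Module):
    """1x1 convolution over ALL channels; mixes information between groups."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.fuse(x)

class DFSNetSketch(nn.Module):
    """Stack of divided blocks with occasional fusion layers. The fusion
    positions here (after blocks 2 and 5) are placeholders for what the
    paper would choose with weight-sharing NAS."""
    def __init__(self, channels=64, depth=6, groups=4, fusion_positions=(2, 5)):
        super().__init__()
        layers = [nn.Conv2d(3, channels, kernel_size=3, padding=1)]
        for i in range(depth):
            layers.append(DividedBlock(channels, channels, groups=groups))
            if i in fusion_positions:
                layers.append(FusionLayer(channels))
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

if __name__ == "__main__":
    model = DFSNetSketch()
    y = model(torch.randn(1, 3, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```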