Abstract

As deep learning is rapidly adopted across industry, efficient training of deep neural networks (DNNs) has become increasingly important. To train a DNN on large amounts of data, distributed training with data parallelism is widely used. However, communication overhead limits the scalability of distributed training, and a number of distributed training algorithms have been proposed to reduce it. The model accuracy and training performance of these algorithms vary with factors such as the cluster setting, the training model and dataset, and the optimization techniques applied. To adopt the distributed training algorithm best suited to a given situation, practitioners must therefore understand the model accuracy and training performance of these algorithms across a range of settings. Toward this end, this paper reviews and evaluates seven popular distributed training algorithms (BSP, ASP, SSP, EASGD, AR-SGD, GoSGD, and AD-PSGD) in terms of model accuracy and training performance in various settings. Specifically, we evaluate these algorithms for two CNN models, in different cluster settings, and with three well-known optimization techniques. Through extensive evaluation and analysis, we made several interesting discoveries. For example, we found that some distributed training algorithms (SSP, EASGD, and GoSGD) can have a highly negative impact on model accuracy because they adopt intermittent and asymmetric communication to improve training performance, and that the communication overhead of the centralized algorithms (ASP and SSP) is much higher than expected in cluster settings with limited network bandwidth because of the parameter server (PS) bottleneck. These findings, and many more in the paper, can guide the adoption of appropriate distributed training algorithms in industry; they can also inform the design of new distributed training algorithms in academia.
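To make the communication patterns behind three of the reviewed algorithms concrete, the sketch below simulates data-parallel SGD on a toy least-squares problem under BSP (synchronous gradient averaging), ASP (workers push possibly stale gradients to a parameter server without synchronization), and SSP (ASP with a bounded staleness threshold). This is a minimal single-process illustration under our own assumptions; the toy model, the staleness bound, and all names are illustrative and are not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): simulating BSP, ASP, and SSP
# data-parallel SGD on a toy least-squares problem with NumPy.
import numpy as np

rng = np.random.default_rng(0)
d, n_workers, lr, steps = 5, 4, 0.05, 200

# Ground-truth weights and per-worker data shards (the "data parallelism").
w_true = rng.normal(size=d)
shards = []
for _ in range(n_workers):
    X = rng.normal(size=(100, d))
    y = X @ w_true + 0.01 * rng.normal(size=100)
    shards.append((X, y))

def grad(w, shard):
    """Least-squares gradient computed on one worker's data shard."""
    X, y = shard
    return 2 * X.T @ (X @ w - y) / len(y)

def train(mode, staleness_bound=3):
    w = np.zeros(d)                                  # parameter-server (PS) weights
    local = [w.copy() for _ in range(n_workers)]     # workers' (stale) weight copies
    clock = [0] * n_workers                          # per-worker iteration counts
    for step in range(steps):
        if mode == "BSP":
            # Synchronous: all workers compute at the same weights,
            # and the PS applies the averaged gradient.
            g = np.mean([grad(w, s) for s in shards], axis=0)
            w -= lr * g
        else:
            # ASP/SSP: one randomly scheduled worker pushes a gradient
            # computed at its possibly stale local weights.
            k = int(rng.integers(n_workers))
            if mode == "SSP" and clock[k] - min(clock) > staleness_bound:
                continue                             # SSP: fast worker waits
            w -= lr * grad(local[k], shards[k])      # stale gradient applied
            local[k] = w.copy()                      # worker pulls fresh weights
            clock[k] += 1
    return np.linalg.norm(w - w_true)

for mode in ("BSP", "ASP", "SSP"):
    print(f"{mode}: final error {train(mode):.4f}")
```

In a real system the workers run in parallel processes and the PS handles concurrent pushes; the round-based loop above only illustrates the consistency models, i.e., when each algorithm lets a worker compute on stale weights and when it forces synchronization.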
