Abstract

Deep neural networks (DNNs) achieve higher accuracy as the amount of training data increases. However, training data such as personal medical data are often privacy sensitive and cannot be gathered at a central site. Methods have therefore been proposed for training with distributed data that remain in a wide area network. Because of the heterogeneity of a wide area network, methods based on synchronous communication, such as all-reduce stochastic gradient descent (SGD), are unsuitable, and gossip SGD is promising because it relies on asynchronous communication. Communication time is nevertheless a problem in a wide area network, and gossip SGD cannot use double buffering, a technique for hiding communication time, because it uses an asynchronous communication method. In this paper, we propose a type of gossip SGD in which computation and communication overlap to accelerate learning. The proposed method shares newer models by scheduling communication: the nodes share estimates of the communication time and information on which nodes are currently able to communicate. The method is effective in both homogeneous and heterogeneous networks. Experimental results on the CIFAR-100 and Fashion-MNIST datasets demonstrate the faster convergence of the proposed method.
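As a rough illustration of the scheduling idea described above, the Python sketch below shows how a node might pick a peer and decide when to start a model exchange, given shared estimates of communication time and a list of nodes that are currently able to communicate. The names (plan_exchange, comm_time_est, available_peers, step_compute_time) are assumptions made for illustration, not the paper's notation or implementation.

```python
# Hypothetical sketch of communication scheduling for gossip SGD.
# Assumption: each node knows (a) its own compute time per training step and
# (b) estimated communication times to peers that are currently free.
# All names here are illustrative, not taken from the paper.

def plan_exchange(step_compute_time, comm_time_est, available_peers):
    """Choose a free peer and a start offset for the model exchange.

    Starting the exchange as late as possible, while still finishing before
    the local computation ends, hides the communication time and means the
    model that is sent (and received) is as new as possible.
    """
    if not available_peers:
        return None, 0.0  # no peer is free; skip the exchange this step
    # Prefer the free peer with the shortest estimated communication time.
    peer = min(available_peers, key=lambda p: comm_time_est[p])
    # Delay the start so the exchange completes roughly when computation does.
    start_offset = max(0.0, step_compute_time - comm_time_est[peer])
    return peer, start_offset


# Example: a step takes 0.8 s of computation; peers 2 and 5 are free.
peer, offset = plan_exchange(
    step_compute_time=0.8,
    comm_time_est={2: 0.3, 5: 0.6},
    available_peers=[2, 5],
)
print(peer, offset)  # -> 2 0.5
```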

Highlights

  • Since AlexNet [1] won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for object detection and image classification in 2012, deep neural networks (DNNs) have had an impact on many fields, including image recognition, speech recognition, and language processing

  • For DNNs that are trained on a single computer cluster rather than a wide area network, methods that share models through a parameter server [6]–[10] and all-reduce stochastic gradient descent (SGD) [4], [11]–[16], which shares models with a synchronous all-reduce communication step, have been studied

  • We propose a type of gossip SGD in which the computation and communication overlap


Summary

INTRODUCTION

Since AlexNet [1] won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) for object detection and image classification in 2012, deep neural networks (DNNs) have had an impact on many fields, including image recognition, speech recognition, and language processing. For training across a wide area network, gossip stochastic gradient descent (SGD) [3]–[5] is the most common method. For DNNs that are trained on a single computer cluster rather than a wide area network, methods that share models through a parameter server [6]–[10] and all-reduce SGD [4], [11]–[16], which shares models with a synchronous all-reduce communication step, have been studied. In some distributed DNN training methods based on synchronous communication, the computation on nodes and the communication between nodes overlap to hide the communication time.
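The following Python sketch illustrates, under simplifying assumptions, how a gossip-style model exchange can run in a background thread so that it overlaps with the local gradient computation. The helpers send_fn, recv_fn, and grad_fn are hypothetical placeholders for the actual communication and training code; this is a minimal sketch of the overlap idea, not the paper's implementation.

```python
import threading
import numpy as np

def overlapped_gossip_step(params, grad_fn, lr, peer, send_fn, recv_fn):
    """One training step in which the model exchange with a peer overlaps
    with the local gradient computation (gossip averaging at the end)."""
    received = {}

    def exchange():
        send_fn(peer, params.copy())        # ship the current model to the peer
        received["params"] = recv_fn(peer)  # receive the peer's model

    t = threading.Thread(target=exchange)
    t.start()                               # communication runs in the background

    grad = grad_fn(params)                  # local computation proceeds meanwhile
    local_params = params - lr * grad

    t.join()                                # wait for the exchange to finish
    # Gossip averaging: mix the updated local model with the peer's model.
    return 0.5 * (local_params + received["params"])

# Toy usage with dummy communication: the "peer" just echoes a fixed model.
peer_model = np.ones(3)
new_params = overlapped_gossip_step(
    params=np.zeros(3),
    grad_fn=lambda w: 2 * w,                # gradient of ||w||^2
    lr=0.1,
    peer=0,
    send_fn=lambda p, w: None,              # no-op send
    recv_fn=lambda p: peer_model,           # pretend the peer replied
)
print(new_params)  # -> [0.5 0.5 0.5]
```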

RELATED WORK
GOSSIP SGD
PROPOSED METHOD
EXPERIMENT
CONCLUSION
