Abstract

Communication overhead, computation load, and convergence speed are three major challenges to the scalability of distributed stochastic optimization algorithms for training large neural networks. In this paper, we propose hybrid-order distributed stochastic gradient descent (HO-SGD), which strikes a better balance among these three than previous methods, for a general class of non-convex stochastic optimization problems. In particular, we advocate that by properly interleaving zeroth-order and first-order gradient updates, it is possible to significantly reduce communication and computation overheads while guaranteeing fast convergence. The proposed method attains the same order of convergence rate as the fastest distributed methods (i.e., fully synchronous SGD) with significantly lower computational complexity and communication overhead per iteration, and the same order of communication overhead as state-of-the-art communication-efficient methods with order-wise lower computational complexity. Moreover, it improves the convergence rate of zeroth-order SGD methods by an order. Finally, and remarkably, empirical studies demonstrate that the proposed hybrid-order approach achieves significantly higher test accuracy and better generalization than all baselines, owing to its novel exploration mechanism.
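To make the interleaving idea concrete, the following is a minimal sketch, not the paper's exact algorithm: the quadratic loss, the two-point zeroth-order estimator, and the schedule of one first-order step per block of K iterations are all illustrative assumptions.

```python
# Sketch of interleaved zeroth-order (ZO) and first-order (FO) SGD updates.
# Loss, ZO estimator, and the K-step schedule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)

def loss(x):
    return 0.5 * np.sum((A @ x - b) ** 2) / len(b)

def grad(x):
    # Exact first-order gradient of the quadratic loss.
    return A.T @ (A @ x - b) / len(b)

def zo_grad(x, mu=1e-4):
    # Two-point zeroth-order gradient estimate along a random direction.
    u = rng.standard_normal(x.shape)
    return (loss(x + mu * u) - loss(x)) / mu * u

x = np.zeros(10)
lr, K = 0.05, 5  # one first-order step per block of K iterations (assumed)
for t in range(500):
    g = grad(x) if t % K == 0 else zo_grad(x)  # interleave FO and ZO updates
    x -= lr * g

print(f"final loss: {loss(x):.4f}")
```

The cheap zeroth-order steps need only function evaluations (and a single scalar can be communicated per step), while the occasional first-order steps supply accurate gradient information; the actual HO-SGD schedule and estimators should be taken from the paper itself.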
