DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization

Boyue Li, Zhize Li, Yuejie Chi

doi:10.1137/21m1450677

Abstract

Emerging applications in multi-agent environments, such as the internet-of-things, networked sensing, autonomous systems, and federated learning, call for decentralized algorithms for finite-sum optimization that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by communicating only with their neighbors over a predetermined network topology. We develop a new algorithm, called DEcentralized STochastic REcurSive gradient methodS (DESTRESS), for nonconvex finite-sum optimization, which matches the optimal incremental first-order oracle (IFO) complexity of centralized algorithms for finding first-order stationary points while maintaining communication efficiency. Detailed theoretical and numerical comparisons corroborate that the resource efficiency of DESTRESS improves upon prior decentralized algorithms over a wide range of parameter regimes. DESTRESS leverages several key algorithm design ideas, including randomly activated stochastic recursive gradient updates with mini-batches for local computation and gradient tracking with extra mixing (i.e., multiple gossiping rounds) for per-iteration communication, together with careful choices of hyper-parameters and new analysis frameworks, to provably achieve a desirable computation-communication trade-off.
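
To make the communication ingredient named in the abstract more concrete, the following is a minimal sketch of gradient tracking with extra mixing on a toy problem. It is not the DESTRESS algorithm: it uses full local gradients rather than randomly activated stochastic recursive gradients with mini-batches, and every name in it (`ring_mixing_matrix`, `gossip`, `gradient_tracking`, the ring topology, the 1/3 weights, the scalar losses f_i(x) = 0.5 (x - b_i)^2, the step size) is an illustrative assumption rather than something taken from the paper.

```python
def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents.

    Assumed weights: 1/3 on the agent itself and on each of its two neighbors.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def gossip(values, W, rounds):
    """Extra mixing: apply the mixing matrix `rounds` times in a row."""
    n = len(values)
    for _ in range(rounds):
        values = [sum(W[i][j] * values[j] for j in range(n)) for i in range(n)]
    return values


def consensus_error(values):
    """Largest deviation of any agent's value from the network average."""
    mean = sum(values) / len(values)
    return max(abs(v - mean) for v in values)


def gradient_tracking(b, W, rounds, step, iters):
    """Gradient tracking on the toy losses f_i(x) = 0.5 * (x - b_i)**2.

    Each agent i keeps a local iterate x[i] and a tracker y[i] that
    estimates the network-average gradient. Both are mixed with
    `rounds` gossip rounds per iteration.
    """
    n = len(b)
    x = [0.0] * n
    g = [x[i] - b[i] for i in range(n)]  # local gradients at the start
    y = list(g)                          # trackers start at local gradients
    for _ in range(iters):
        # Descend along the tracked gradient, then mix with neighbors.
        x = gossip([x[i] - step * y[i] for i in range(n)], W, rounds)
        g_new = [x[i] - b[i] for i in range(n)]
        # Add the change in local gradient to the tracker, then mix.
        y = gossip([y[i] + g_new[i] - g[i] for i in range(n)], W, rounds)
        g = g_new
    return x


if __name__ == "__main__":
    b = [1.0, 2.0, 3.0, 4.0, 10.0]       # hypothetical local data
    W = ring_mixing_matrix(len(b))
    for k in (1, 3, 5):
        print(k, consensus_error(gossip(b, W, k)))
    print(gradient_tracking(b, W, rounds=3, step=0.3, iters=200))
```

In this toy setting the minimizer of the sum of local losses is the mean of `b`, so all agents should end up close to 4.0. Increasing `rounds` lowers the consensus error per iteration at the cost of more communication, which is the kind of computation-communication trade-off the abstract refers to.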

Journal: SIAM Journal on Mathematics of Data Science
Publication Date: Aug 4, 2022
Citations: 2

Similar Papers

Stochastic gradient method with accelerated stochastic dynamics
Masayuki Ohzeki
Journal of Physics: Conference Series | VOL. 699
01 Mar 2016

Biased stochastic conjugate gradient algorithm with adaptive step size for nonconvex problems
Ruping Huang ... Gonglin Yuan
Expert Systems With Applications | VOL. 238
22 Sep 2023

Distributed Stochastic Gradient Tracking Algorithm With Variance Reduction for Non-Convex Optimization.
Xia Jiang ... Jie Chen
IEEE Transactions on Neural Networks and Learning Systems | VOL. PP
01 Sep 2023

Minimizing finite sums with the stochastic average gradient
Mark Schmidt ... Francis Bach
Mathematical Programming | VOL. 162
14 Jun 2016
