Minimizing finite sums with the stochastic average gradient

Mark Schmidt, Nicolas Le Roux, Francis Bach

doi:10.1007/s10107-016-1030-6

Abstract

We analyze the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in the sum. However, by incorporating a memory of previous gradient values, the SAG method achieves a faster convergence rate than black-box SG methods. The convergence rate is improved from $$O(1/\sqrt{k})$$ to $$O(1/k)$$ in general, and when the sum is strongly convex the convergence rate is improved from the sub-linear $$O(1/k)$$ to a linear convergence rate of the form $$O(\rho^k)$$ for $$\rho < 1$$. Further, in many cases the convergence rate of the new method is also faster than that of black-box deterministic gradient methods, in terms of the number of gradient evaluations. This extends our earlier work, Le Roux et al. (Adv Neural Inf Process Syst, 2012), which only led to a faster rate for well-conditioned strongly convex problems. Numerical experiments indicate that the new algorithm often dramatically outperforms existing SG and deterministic gradient methods, and that the performance may be further improved through the use of non-uniform sampling strategies.
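The "memory of previous gradient values" can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a least-squares objective, uniform sampling, remembered gradients initialized to zero, and a caller-supplied constant step size; the function name `sag_least_squares` and its parameters are hypothetical. Each iteration evaluates the gradient of one randomly chosen term, swaps it into a running sum of remembered gradients, and steps along the average of that sum.

```python
import random


def sag_least_squares(A, b, step, iters, seed=0):
    """SAG-style sketch for min_x (1/n) * sum_i 0.5 * (a_i . x - b_i)^2.

    Assumptions (not from the source): least-squares terms, uniform
    sampling, zero-initialized gradient memory, constant step size.
    """
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    # The gradient of term i is residual_i * a_i, so one stored scalar
    # per term is enough to remember its most recent gradient.
    stored_residual = [0.0] * n
    grad_sum = [0.0] * d  # running sum of the remembered gradients

    for _ in range(iters):
        i = rng.randrange(n)  # only one term is evaluated per iteration
        residual = sum(a * xj for a, xj in zip(A[i], x)) - b[i]
        delta = residual - stored_residual[i]
        # Replace term i's old gradient in the running sum with the new one.
        for j in range(d):
            grad_sum[j] += delta * A[i][j]
        stored_residual[i] = residual
        # Step along the average of all remembered gradients.
        for j in range(d):
            x[j] -= step * grad_sum[j] / n
    return x


# Small consistent system whose exact solution is x = [2, -1].
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
b = [2.0, -1.0, 1.0, 3.0]
x_hat = sag_least_squares(A, b, step=1.0 / 32.0, iters=5000)
```

The per-iteration cost here does not grow with the number of terms, because only the sampled term's gradient is recomputed. The step size `1/32` is an illustrative conservative choice for this toy data, not a recommendation taken from the abstract.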

Journal: Mathematical Programming
Publication Date: Jun 14, 2016
Citations: 795

Similar Papers

A Novel Stochastic Stratified Average Gradient Method: Convergence Rate and Its Complexity
Aixiang Andy Chen ... Rui Bian
01 Jul 2018

CSG: A new stochastic gradient method for the efficient solution of structural optimization problems with infinitely many states
Lukas Pflug ... Max Grieshammer
Structural and Multidisciplinary Optimization | VOL. 61
31 May 2020

Stochastic gradient method with accelerated stochastic dynamics
Masayuki Ohzeki
Journal of Physics: Conference Series | VOL. 699
01 Mar 2016

Biased stochastic conjugate gradient algorithm with adaptive step size for nonconvex problems
Ruping Huang ... Gonglin Yuan
Expert Systems With Applications | VOL. 238
22 Sep 2023
