Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods

Nicolas Loizou,Peter Richtárik

doi:10.1007/s10589-020-00220-z

Abstract

In this paper we study several classes of stochastic optimization algorithms enriched with heavy ball momentum. Among the methods studied are: stochastic gradient descent, stochastic Newton, stochastic proximal point and stochastic dual subspace ascent. This is the first time momentum variants of several of these methods are studied. We choose to perform our analysis in a setting in which all of the above methods are equivalent: convex quadratic problems. We prove global non-asymptotic linear convergence rates for all methods and various measures of success, including primal function values, primal iterates, and dual function values. We also show that the primal iterates converge at an accelerated linear rate in a somewhat weaker sense. This is the first time a linear rate is shown for the stochastic heavy ball method (i.e., stochastic gradient descent method with momentum). Under somewhat weaker conditions, we establish a sublinear convergence rate for Cesàro averages of primal iterates. Moreover, we propose a novel concept, which we call stochastic momentum, aimed at decreasing the cost of performing the momentum step. We prove linear convergence of several stochastic methods with stochastic momentum, and show that in some sparse data regimes and for sufficiently small momentum parameters, these methods enjoy better overall complexity than methods with deterministic momentum. Finally, we perform extensive numerical testing on artificial and real datasets, including data coming from average consensus problems.
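The stochastic heavy ball method mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the update rule, so the code below assumes the standard heavy ball form, a stochastic gradient step followed by a momentum term `beta * (x_k - x_{k-1})`, applied to a consistent linear system `Ax = b` with one uniformly sampled row per iteration (a randomized Kaczmarz-style step). The function name `heavy_ball_kaczmarz`, the parameter names `omega` and `beta`, and their default values are illustrative assumptions, not taken from the paper.

```python
import random


def heavy_ball_kaczmarz(A, b, omega=1.0, beta=0.3, iters=2000, seed=0):
    """Sketch: row-sampling SGD with heavy ball momentum for a consistent Ax = b.

    Assumed update (not quoted from the paper):
        x_{k+1} = x_k - omega * grad_i(x_k) + beta * (x_k - x_{k-1})
    where grad_i(x) = ((a_i . x - b_i) / ||a_i||^2) * a_i for a sampled row i.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    row_sq_norms = [sum(a * a for a in row) for row in A]

    x_prev = [0.0] * n  # x_{k-1}
    x = [0.0] * n       # x_k

    for _ in range(iters):
        i = rng.randrange(m)  # sample one row uniformly at random
        row = A[i]
        # Scaled residual of the sampled equation at the current iterate.
        r = (sum(a * xj for a, xj in zip(row, x)) - b[i]) / row_sq_norms[i]
        # Stochastic gradient step plus heavy ball momentum.
        x_next = [
            xj - omega * r * a + beta * (xj - xpj)
            for xj, a, xpj in zip(x, row, x_prev)
        ]
        x_prev, x = x, x_next

    return x


if __name__ == "__main__":
    # Small consistent system with solution (1, 2).
    A = [[1.0, 1.0], [1.0, -1.0]]
    b = [3.0, -1.0]
    print(heavy_ball_kaczmarz(A, b))
```

Setting `beta=0` recovers the plain row-sampling method without momentum, which gives a simple baseline for comparing the effect of the momentum term on a given problem.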

Journal: Computational Optimization and Applications
Publication Date: Sep 23, 2020
Citations: 67

Similar Papers

Kalman-Based Stochastic Gradient Method with Stop Condition and Insensitivity to Conditioning
Vivak Patel
SIAM Journal on Optimization | VOL. 26
01 Jan 2015

Numerical methods for distributed stochastic compositional optimization problems with aggregative structure
Shengchao Zhao ... Yongchao Liu
Optimization Methods and Software | VOL. ahead-of-print
25 Jul 2024

Bi-fidelity stochastic gradient descent for structural optimization under uncertainty
Subhayan De ... Kurt Maute
Computational Mechanics | VOL. 66
03 Aug 2020

Fast identification of a human skeleton-marker model for motion capture system using stochastic gradient descent method
Tianyi Zou ... Tomomichi Sugihara
22 Oct 2020
