Abstract

This paper deals with a natural stochastic optimization procedure derived from the so-called Heavy-ball method differential equation, which was introduced by Polyak in the 1960s with his seminal contribution [Pol64]. The Heavy-ball method is a second-order dynamics that was investigated to minimize convex functions f. The family of second-order methods has recently received a large amount of attention, following the famous contribution of Nesterov [Nes83] and the explosion of large-scale optimization problems. This work provides an in-depth description of the stochastic heavy-ball method, which is an adaptation of the deterministic one when only unbiased evaluations of the gradient are available and used throughout the iterations of the algorithm. We first describe some almost sure convergence results in the case of general non-convex coercive functions f. We then examine the situation of convex and strongly convex potentials and derive some non-asymptotic results about the stochastic heavy-ball method. We end our study with limit theorems on several rescaled algorithms.
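To fix ideas, the sketch below implements the classical momentum form of the heavy-ball iteration driven by an unbiased noisy gradient oracle, $X_{k+1} = X_k - \gamma(\nabla f(X_k) + \xi_k) + \alpha (X_k - X_{k-1})$. This is a minimal illustration, not the exact parameterization analyzed in the paper: the step size gamma, momentum alpha, Gaussian noise model and quadratic test function are all assumptions made for the example.

```python
import numpy as np

def stochastic_heavy_ball(grad, x0, gamma=0.05, alpha=0.9,
                          noise_std=0.1, n_iter=1000, seed=0):
    """Heavy-ball iteration with an unbiased noisy gradient oracle.

    Update: x_{k+1} = x_k - gamma * (grad(x_k) + noise_k)
                          + alpha * (x_k - x_{k-1})
    """
    rng = np.random.default_rng(seed)
    x_prev = np.asarray(x0, dtype=float)
    x = x_prev.copy()
    for _ in range(n_iter):
        # Unbiased gradient estimate: exact gradient plus centered noise.
        g = grad(x) + noise_std * rng.standard_normal(x.shape)
        x, x_prev = x - gamma * g + alpha * (x - x_prev), x
    return x

# Illustrative run on the strongly convex quadratic f(x) = ||x||^2 / 2,
# whose exact gradient is x; the iterates settle near the minimizer 0.
x_final = stochastic_heavy_ball(lambda x: x, x0=np.ones(2) * 5.0)
```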

Highlights

  • Finding the minimum of a function f over a set Ω with an iterative procedure is very popular among numerous scientific communities and has many applications in optimization, image processing, economics and statistics, to name a few

  • The most widespread approaches rely on first-order strategies, in which a sequence $(X_k)_{k \ge 0}$ evolves over Ω through a first-order recursion $X_{k+1} = \Psi[X_k, f(X_k), \nabla f(X_k)]$ that uses a local approximation of f at the point $X_k$, built from the knowledge of $f(X_k)$ and $\nabla f(X_k)$ alone (a concrete instance is sketched after this list)

  • The complexity of each update in first-order methods is relatively low, which makes them useful for large-scale optimization problems, where Interior Point and Newton-like methods are generally too expensive
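As a concrete instance of the recursion in the second highlight, the sketch below takes Ψ to be plain gradient descent, the simplest first-order update. The step size, iteration count and test function are illustrative assumptions, not choices from the paper.

```python
import numpy as np

def first_order_method(f, grad, x0, gamma=0.1, n_iter=200):
    """Generic first-order scheme X_{k+1} = Psi[X_k, f(X_k), grad f(X_k)].

    Here Psi is plain gradient descent, Psi(x, fx, gx) = x - gamma * gx,
    so each update uses only local first-order information at X_k.
    (f is part of the generic signature even though this Psi ignores it.)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - gamma * grad(x)
    return x

# Example: minimize f(x) = ||x - 1||^2 / 2, whose gradient is x - 1.
x_min = first_order_method(lambda x: 0.5 * np.sum((x - 1.0) ** 2),
                           lambda x: x - 1.0, x0=np.zeros(3))
```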


Summary

Introduction

Finding the minimum of a function f over a set Ω with an iterative procedure is very popular among numerous scientific communities and has many applications in optimization, image processing, economics and statistics, to name a few. Among the available interpretations of Nesterov's Accelerated Gradient Descent (NAGD), some recent advances concern the second-order dynamical system of [WSC16], which is a particular case of the generalized Heavy Ball with Friction method (referred to as HBF in the text), as previously pointed out in [CEG09a, CEG09b]. Even though the Robbins-Monro algorithm achieves the optimal $O(1/n)$ rate of convergence for strongly convex functions, this ability is highly sensitive to the step sizes used. This remark led [PJ92] to develop an averaging method that makes it possible to use larger step sizes in the Robbins-Monro algorithm and to average the resulting iterates with a Cesàro procedure, so that the method produces optimal results in the minimax sense (see [NY83]) for convex and strongly convex minimization problems, as pointed out in [BM11]. Other authors [GL13, GL16] obtained convergence results for the stochastic version of a variant of NAGD for non-convex optimization with gradient-Lipschitz functions, but these methods cannot be used for the analysis of the Heavy-ball algorithm. Appendix A collects some important results on the supremum of certain random variables needed for the non-convex case.
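To make the averaging idea concrete, here is a minimal sketch of the Robbins-Monro recursion combined with Polyak-Juditsky (Cesàro) averaging, in the spirit of [PJ92]. The step-size exponent $\beta = 2/3$, the noise model and the quadratic example are assumptions chosen for illustration; the point is only that slowly decreasing ("longer") steps plus averaging of the iterates yield a fast estimator.

```python
import numpy as np

def robbins_monro_averaged(grad, x0, gamma0=0.5, beta=2/3,
                           noise_std=0.1, n_iter=5000, seed=0):
    """Robbins-Monro iterates with Polyak-Juditsky (Cesaro) averaging.

    X_{n+1} = X_n - gamma_{n+1} (grad f(X_n) + noise_n),
    with gamma_n = gamma0 * n^(-beta), beta < 1, and the returned
    estimator is the running average (X_1 + ... + X_n) / n.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    x_bar = np.zeros_like(x)
    for n in range(1, n_iter + 1):
        gamma_n = gamma0 * n ** (-beta)  # slowly decreasing steps
        g = grad(x) + noise_std * rng.standard_normal(x.shape)  # unbiased oracle
        x = x - gamma_n * g
        x_bar += (x - x_bar) / n  # incremental Cesaro mean of the iterates
    return x_bar

# Example: strongly convex quadratic f(x) = ||x||^2 / 2 (gradient x); the
# averaged estimate concentrates around the minimizer 0.
estimate = robbins_monro_averaged(lambda x: x, x0=np.ones(2))
```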

Deterministic Heavy Ball
Stochastic HBF
Baseline assumptions
Main results
Almost sure convergence of the stochastic heavy ball
Preliminary result
Convergence to a local minimum
Exponential memory $r_n = r > 0$
Polynomial memory $r_n = r\,\Gamma_n^{-1} \to 0$
Quadratic case
Reduction to a two dimensional system
Exponential memory $r_n = r$
The non-quadratic case under exponential memory
Rescaling stochastic HBF
Tightness
Identification of the limit
Limit variance
Numerical experiments
Standard tools of stochastic algorithms
Step sizes $\gamma_n = \gamma n^{-\beta}$ with $\beta < 1$
Step sizes $\gamma_n = \gamma n^{-1}$
Expectation of the supremum of the square of sub-Gaussian random variables
