Abstract

The stochastic gradient descent (SGD) optimization algorithm is one of the central tools used to approximate solutions of stochastic optimization problems arising in machine learning and, in particular, deep learning applications. It is therefore important to analyze the convergence behavior of SGD. In this article we consider a simple quadratic stochastic optimization problem and establish for every γ, ν ∈ (0, ∞) essentially matching lower and upper bounds for the mean square error of the associated SGD process with learning rates (γ n^{−ν})_{n∈ℕ}. This allows us to precisely quantify the mean square convergence rate of the SGD method in dependence on the choice of the learning rates.
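
As a rough illustration of the setting described above, the following Python sketch simulates SGD with learning rates γ n^{−ν} on one simple quadratic stochastic optimization problem and estimates the resulting mean square error by Monte Carlo. The particular objective f(θ) = E[(θ − X)²]/2, the Gaussian distribution of X, and all numerical parameters are assumptions made purely for illustration; they are not taken from the paper.

```python
# Minimal simulation sketch (illustrative assumptions, not the paper's exact
# setting): SGD applied to a simple quadratic stochastic optimization problem,
#   f(theta) = E[(theta - X)^2] / 2,  X ~ N(0, sigma^2),
# with learning rates gamma * n**(-nu). The mean square error after n_steps
# is estimated by Monte Carlo over many independent SGD paths.
import numpy as np


def sgd_mean_square_error(gamma, nu, n_steps=10_000, n_paths=5_000,
                          theta0=1.0, sigma=1.0, seed=0):
    """Monte Carlo estimate of the mean square error of SGD after n_steps.

    For the assumed objective the minimizer is theta* = 0, and the
    stochastic gradient at step n is (theta - X_n) with X_n ~ N(0, sigma^2).
    """
    rng = np.random.default_rng(seed)
    theta = np.full(n_paths, theta0, dtype=float)        # one SGD path per entry
    for n in range(1, n_steps + 1):
        x_n = rng.normal(0.0, sigma, size=n_paths)       # fresh sample per path
        stochastic_grad = theta - x_n                    # unbiased gradient estimate
        theta -= gamma * n ** (-nu) * stochastic_grad    # learning rate gamma * n^{-nu}
    return float(np.mean(theta ** 2))                    # MSE relative to theta* = 0


if __name__ == "__main__":
    # Compare the estimated error for a few (gamma, nu) choices.
    for gamma, nu in [(1.0, 0.5), (1.0, 1.0), (1.0, 2.0)]:
        mse = sgd_mean_square_error(gamma, nu)
        print(f"gamma={gamma}, nu={nu}: estimated MSE ~ {mse:.3e}")
```

Varying ν in the driver loop (for example comparing ν < 1, ν = 1, and ν > 1) gives a quick empirical feel for how the decay of the learning rates affects the error in this toy instance.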
