Understanding the acceleration phenomenon via high-resolution differential equations

Bin Shi, Michael I. Jordan, Simon S. Du, Weijie J. Su

doi:10.1007/s10107-021-01681-8

Abstract

Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms, Nesterov’s accelerated gradient method for strongly convex functions (NAG-SC) and Polyak’s heavy-ball method, we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak’s heavy-ball method, but they also allow the identification of a term that we refer to as “gradient correction” that is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence of the two methods. We also use the high-resolution ODE framework to study Nesterov’s accelerated gradient method for (non-strongly) convex functions (NAG-C), uncovering a hitherto unknown result: that NAG-C minimizes the squared gradient norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates of NAG-C for smooth convex functions.
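
To make the "gradient correction" term concrete, the following is a minimal sketch of the two discrete methods written in single-sequence form, where NAG-SC appears as a heavy-ball step minus a term proportional to the difference of consecutive gradients. Several choices here are illustrative assumptions rather than details taken from the abstract: both methods share the momentum coefficient (1 - sqrt(mu*s)) / (1 + sqrt(mu*s)), both take the same first step, and the test problem is a two-dimensional quadratic with hypothetical constants `mu`, `L`, and step size `s`. The function names are likewise made up for this example.

```python
import numpy as np


def momentum_coefficient(mu, s):
    """Momentum coefficient assumed to be shared by both methods."""
    q = np.sqrt(mu * s)
    return (1.0 - q) / (1.0 + q)


def heavy_ball_step(x, x_prev, g, s, alpha):
    """Heavy-ball update: momentum plus a plain gradient step."""
    return x + alpha * (x - x_prev) - s * g


def nag_sc_step(x, x_prev, g, g_prev, s, alpha):
    """NAG-SC update written as a heavy-ball step minus a gradient correction."""
    gradient_correction = alpha * s * (g - g_prev)
    return heavy_ball_step(x, x_prev, g, s, alpha) - gradient_correction


def run(method, grad, x0, mu, s, steps):
    """Return the iterate after `steps` >= 1 updates of the chosen method."""
    alpha = momentum_coefficient(mu, s)
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    # Assumed common first step: x_1 = x_0 - (1 + alpha) * s * grad f(x_0).
    x = x_prev - (1.0 + alpha) * s * g_prev
    for _ in range(steps - 1):
        g = grad(x)
        if method == "nag_sc":
            x_next = nag_sc_step(x, x_prev, g, g_prev, s, alpha)
        else:
            x_next = heavy_ball_step(x, x_prev, g, s, alpha)
        x_prev, g_prev, x = x, g, x_next
    return x


# Hypothetical test problem: f(x) = 0.5 * x^T A x with A = diag(mu, L).
mu, L = 0.01, 1.0
A = np.diag([mu, L])
s = 1.0 / (4.0 * L)
x0 = np.array([1.0, 1.0])


def f(x):
    return 0.5 * x @ A @ x


def grad(x):
    return A @ x


for name in ("heavy_ball", "nag_sc"):
    print(name, f(run(name, grad, x0, mu, s, steps=100)))
```

The only difference between the two update functions is the `gradient_correction` line, which is the discrete counterpart of the term the abstract describes as present in NAG-SC but absent from the heavy-ball method.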

Journal: Mathematical Programming
Publication Date: Jul 6, 2021
Citations: 56
License type: open-access
