Abstract
This paper considers the analysis of continuous time gradient-based optimization algorithms through the lens of nonlinear contraction theory. It demonstrates that in the case of a time-invariant objective, most elementary results on gradient descent based on convexity can be replaced by much more general results based on contraction. In particular, gradient descent converges to a unique equilibrium if its dynamics are contracting in any metric, with convexity of the cost corresponding to the special case of contraction in the identity metric. More broadly, contraction analysis provides new insights for the case of geodesically-convex optimization, wherein non-convex problems in Euclidean space can be transformed to convex ones posed over a Riemannian manifold. In this case, natural gradient descent converges to a unique equilibrium if it is contracting in any metric, with geodesic convexity of the cost corresponding to contraction in the natural metric. New results using semi-contraction provide additional insights into the topology of the set of optimizers in the case when multiple optima exist. Furthermore, they show how semi-contraction may be combined with specific additional information to reach broad conclusions about a dynamical system. The contraction perspective also easily extends to time-varying optimization settings and allows one to recursively build large optimization structures out of simpler elements. Extensions to natural primal-dual optimization and game-theoretic contexts further illustrate the potential reach of these new perspectives.
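The central correspondence in the abstract, that gradient descent converges to a unique equilibrium when its dynamics are contracting, with convexity as contraction in the identity metric, can be illustrated numerically. The sketch below (an assumed example, not from the paper) integrates the gradient flow x' = -∇f(x) for a strongly convex quadratic from two initial conditions; contraction in the identity metric means the flow Jacobian -∇²f is uniformly negative definite, so the trajectories converge toward each other and to the unique equilibrium.

```python
import numpy as np

# Hedged sketch: gradient flow x' = -grad f(x) for the strongly convex
# quadratic f(x) = 0.5 x^T A x. The flow Jacobian is -A; since A is
# positive definite, the system is contracting in the identity metric.
def gradient_flow(x0, A, dt=0.01, steps=1000):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x -= dt * (A @ x)   # Euler step of x' = -A x = -grad f(x)
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])  # positive definite metric of contraction
xa = gradient_flow([5.0, -2.0], A)
xb = gradient_flow([-4.0, 6.0], A)
d0 = np.linalg.norm(np.array([5.0, -2.0]) - np.array([-4.0, 6.0]))
d1 = np.linalg.norm(xa - xb)
print(d1 < d0)  # distance between trajectories shrinks exponentially
```

Any two trajectories contract toward each other at a rate set by the smallest eigenvalue of A, which is the exponential convergence guarantee that contraction analysis certifies without invoking convexity directly.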
Highlights
This paper considers the analysis of continuous-time gradient-based optimization through the lens of nonlinear contraction theory
In the special case of a dense Hessian metric M(x) = ∇²ψ(x) arising from a potential ψ(x), continuous mirror descent provides an alternate method to compute continuous natural gradient descent. These methods can avoid the need to invert the metric in cases where an explicit inverse exists for the change of variables z = ∇ψ(x), or when (15) can be run at a fast time scale to invert the gradient map through dynamics
This paper has demonstrated that nonlinear contraction analysis provides a general perspective for analyzing and certifying the global convergence properties of gradient-based optimization algorithms
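The highlight on Hessian metrics can be sketched concretely. Assuming the negative-entropy potential ψ(x) = Σᵢ xᵢ log xᵢ as an illustrative example (not taken from the paper), the gradient map ∇ψ(x) = log x + 1 has the explicit inverse x = exp(z - 1), so continuous mirror descent z' = -∇f(x) reproduces the natural gradient flow x' = -M(x)⁻¹∇f(x) without ever forming or inverting the metric M(x) = ∇²ψ(x):

```python
import numpy as np

# Hedged sketch: for the (assumed) entropy potential psi(x) = sum x_i log x_i,
# the Hessian metric is M(x) = diag(1/x_i), so M(x)^{-1} = diag(x_i).
# Mirror descent integrates z' = -grad f(x) in the dual variable
# z = grad psi(x) = log x + 1, using the explicit inverse x = exp(z - 1).
c = np.array([1.0, 2.0])
grad_f = lambda x: x - c           # f(x) = 0.5 ||x - c||^2
dt, steps = 0.001, 5000

x_nat = np.array([0.5, 0.5])                 # natural gradient state
z = np.log(np.array([0.5, 0.5])) + 1.0       # mirror-descent dual state
for _ in range(steps):
    x_nat += dt * (-x_nat * grad_f(x_nat))   # x' = -M(x)^{-1} grad f(x)
    z += dt * (-grad_f(np.exp(z - 1.0)))     # z' = -grad f(x), x = exp(z-1)
x_mir = np.exp(z - 1.0)
print(np.allclose(x_nat, x_mir, atol=0.05))  # the two flows coincide
```

Both discretizations track the same continuous flow, which is the sense in which mirror descent "inverts the metric for free" whenever the change of variables z = ∇ψ(x) has an explicit inverse.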
Summary
This paper considers the analysis of continuous-time gradient-based optimization through the lens of nonlinear contraction theory. Geodesic convexity [12, 13] generalizes convexity to a Riemannian setting, with applicability to optimization on manifolds [14], as well as to conventional Euclidean settings where R^n is endowed with a manifold structure through the definition of a metric. We consider another class of conditions for the convergence of gradient and natural gradient descent to a globally optimal point. We consider the extensions of these results to natural gradient descent, where geodesic convexity of a function corresponds to contraction of its natural gradient system in the natural metric. In both cases, results highlight the topology of the set of optimizers in the case of semi-contraction, which would have most direct applicability to over-parameterized networks.
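The geodesic-convexity correspondence in the summary can be illustrated with a one-dimensional example (an assumption for illustration, not from the paper): f(x) = (log x)² on x > 0 is not convex in the Euclidean sense, since f''(x) = 2(1 - log x)/x² is negative for x > e, yet under the metric m(x) = 1/x² its natural gradient flow x' = -m(x)⁻¹f'(x) = -2x log x becomes the linear contracting system z' = -2z in the coordinate z = log x:

```python
import numpy as np

# Hedged sketch: f(x) = (log x)^2 is Euclidean-non-convex for x > e,
# but its natural gradient flow under the (assumed) metric m(x) = 1/x^2
# is contracting: in z = log x it reads z' = -2z.
def natural_gradient_flow(x0, dt=0.01, steps=500):
    x = float(x0)
    for _ in range(steps):
        x -= dt * 2.0 * x * np.log(x)   # Euler step of x' = -2 x log x
    return x

f_pp = lambda x: 2.0 * (1.0 - np.log(x)) / x**2
print(f_pp(10.0) < 0)                # Euclidean non-convexity at x = 10
xa = natural_gradient_flow(10.0)
xb = natural_gradient_flow(0.1)
print(abs(np.log(xa) - np.log(xb)))  # metric distance shrinks toward 0
```

Both trajectories collapse onto the unique minimizer x = 1, with the distance measured in the natural metric (here |log xa - log xb|) decaying exponentially, which is exactly contraction of the natural gradient system in the natural metric.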