Abstract

We revisit the Ravine method of Gelfand and Tsetlin from a dynamical systems perspective, study its convergence properties, and highlight its similarities and differences with the Nesterov accelerated gradient method. The two methods are closely related: they can be deduced from each other by reversing the order of the extrapolation and gradient operations in their definitions, and they enjoy similar fast convergence of the values and convergence of the iterates for general convex objective functions. We also establish the high-resolution ODE of the Ravine and Nesterov methods and reveal, for both, an additional geometric damping term driven by the Hessian. This allows us to prove fast convergence of the gradients toward zero not only for the Ravine method but also, for the first time, for the Nesterov method. In the strongly convex case, we show linear convergence for the Ravine method at an optimal rate. We also highlight connections to other algorithms resulting from more subtle discretization schemes, and finally describe a Ravine version of the proximal-gradient algorithm for general structured smooth + nonsmooth convex optimization problems.
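To make the structural relationship concrete, here is a schematic comparison of the two iterations for a smooth convex $f$, written with an assumed step size $s > 0$ and momentum coefficients $\alpha_k$ (e.g. $\alpha_k = \frac{k}{k+3}$); the precise parameter choices and notation are those of the paper, which this sketch only approximates:

\[
\text{Nesterov:}\quad y_k = x_k + \alpha_k (x_k - x_{k-1}), \qquad x_{k+1} = y_k - s\,\nabla f(y_k),
\]
\[
\text{Ravine:}\quad w_k = y_k - s\,\nabla f(y_k), \qquad y_{k+1} = w_k + \alpha_k (w_k - w_{k-1}).
\]

In the first scheme the extrapolation (momentum) step precedes the gradient step, while in the second the gradient step is applied first and the extrapolation acts on the gradient-corrected points; this is the reversal of order referred to above.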
