We investigate convex differentiable optimization and explore the temporal discretization of damped inertial dynamics driven by the gradient of the objective function. This leads to three accelerated gradient algorithms: Nesterov Accelerated Gradient (NAG), Ravine Accelerated Gradient (RAG), and the Inertial Gradient Algorithm with Hessian-driven Damping (IGAHD). The latter was introduced by discretizing inertial dynamics with Hessian-driven damping, which attenuates the oscillations inherent in inertial methods. By analysing the high-resolution ODEs of orders p = 0, 1, 2 for these algorithms, we gain insight into their similarities and differences. All three algorithms share the same low-resolution ODE of order 0, namely the dynamic proposed as a continuous surrogate for (NAG). To differentiate the Nesterov method from the Ravine method, we refine the comparison and show that the two have distinct high-resolution ODEs of order 2 in the step size h (termed super-resolution ODEs). The corresponding Taylor expansions in h reveal matching terms of order 1 but differing terms of order 2. To the best of our knowledge, this result is completely new and emphasizes the need to avoid confusing the Ravine and Nesterov methods in the literature. We present numerical experiments illustrating our theoretical results. Performance profiles based on the number of iterations indicate that (IGAHD) outperforms both (NAG) and (RAG), while (RAG) has a slight advantage over (NAG) in the average number of iterations. In terms of CPU time, both (RAG) and (NAG) outperform (IGAHD). All three algorithms exhibit similar behaviour when evaluated on gradient norms.
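For orientation, the dynamics referred to above can be sketched in the notation commonly used in this literature (a recalled sketch, not quoted from the paper; $\alpha$ denotes the vanishing viscous damping parameter and $\beta \geq 0$ the Hessian-driven damping coefficient). The shared low-resolution dynamic of order 0, proposed as a continuous surrogate for (NAG), is
\[
  \ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0,
\]
and (IGAHD) is obtained by temporal discretization of its Hessian-damped counterpart
\[
  \ddot{x}(t) + \frac{\alpha}{t}\,\dot{x}(t) + \beta\,\nabla^2 f\bigl(x(t)\bigr)\,\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0 .
\]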