Abstract
This work considers gradient descent for $L$-smooth convex optimization with stepsizes larger than the classic regime where descent can be ensured. The stepsize schedules considered are similar to, but differ slightly from, the recent silver stepsizes of Altschuler and Parrilo. For one of our stepsize sequences, we prove an $O(1/T^{\log_2(1+\sqrt{2})}) \approx O(1/T^{1.2716})$ convergence rate in terms of objective gap decrease, and for the other, we show the same rate of decrease for the squared gradient norm. The first result improves on the recent result of Altschuler and Parrilo by a constant factor, while the second improves on the exponent of the prior best squared-gradient-norm convergence guarantee of $O(1/T)$. Funding: B. Grimmer’s work was supported in part by the Air Force Office of Scientific Research under award number FA9550-23-1-0531.
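For concreteness, below is a minimal sketch (assuming NumPy) of gradient descent run with the Altschuler–Parrilo silver stepsize schedule, which the schedules studied in this work closely resemble but do not match exactly. The schedule formula, the function names, and the toy quadratic are illustrative assumptions, not the exact sequences or experiments of the paper.

```python
import numpy as np

RHO = 1 + np.sqrt(2)  # silver ratio; log2(RHO) ~ 1.2716 is the exponent in the rate


def silver_stepsizes(T):
    """Silver-like stepsize schedule, following the Altschuler-Parrilo
    construction as commonly stated: h_i = 1 + RHO**(nu(i) - 1), where
    nu(i) is the 2-adic valuation of i. (Illustrative only; the schedules
    analyzed in this work are similar but differ slightly.)"""
    def nu(i):
        # exponent of the largest power of 2 dividing i
        return (i & -i).bit_length() - 1
    return [1 + RHO ** (nu(i) - 1) for i in range(1, T + 1)]


def gradient_descent(grad, x0, L, T):
    """Run T steps of x_{k+1} = x_k - (h_k / L) * grad(x_k)."""
    x = np.asarray(x0, dtype=float)
    for h in silver_stepsizes(T):
        x = x - (h / L) * grad(x)
    return x


# Toy usage: a 1-smooth convex quadratic f(x) = 0.5 * x^T diag(1, 0.1) x.
Q = np.diag([1.0, 0.1])
x_T = gradient_descent(lambda x: Q @ x, x0=[5.0, 5.0], L=1.0, T=127)
print(x_T)  # approaches the minimizer at the origin
```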