Abstract

We consider the minimization of a nonconvex quadratic form regularized by a cubic term, which may exhibit saddle points and a suboptimal local minimum. Nonetheless, we prove that, under mild assumptions, gradient descent approximates the global minimum to within $\varepsilon$ accuracy in $O(\varepsilon^{-1}\log(1/\varepsilon))$ steps for large $\varepsilon$ and $O(\log(1/\varepsilon))$ steps for small $\varepsilon$ (compared to a condition number we define), with at most logarithmic dependence on the problem dimension. When we use gradient descent to approximate the cubic-regularized Newton step, our result implies a rate of convergence to second-order stationary points of general smooth nonconvex functions.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call