Abstract

Abstract A significant theoretical advantage of high-order optimization methods is their superior convergence guarantees. For instance, third-order regularized methods reach an $(\epsilon _1,\epsilon _2,\epsilon _3)$third-order critical point in at most ${\mathcal {O}} (\max (\epsilon _1^{-4/3}, \epsilon _2^{-2}, \epsilon _3^{-4} ) )$ iterations. However, the cost of computing high-order derivatives is prohibitively expensive in real applications, including for instance many real-world machine learning tasks. In order to address this problem, we present a sub-sampled optimization method that uses a third-order regularized model to find local minima of smooth and potentially nonconvex objective functions with a finite-sum structure. This algorithm uses sub-sampled derivatives instead of exact quantities and is guaranteed to converge to a third-order critical point. Our analysis relies on a novel tensor concentration inequality for sums of tensors of any order that makes explicit use of the finite-sum structure of the objective function.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call