We prove lower bounds on the complexity of finding $$\epsilon$$-stationary points (points $$x$$ such that $$\Vert \nabla f(x)\Vert \le \epsilon$$) of smooth, high-dimensional, and potentially non-convex functions $$f$$. We consider oracle-based complexity measures, where an algorithm is given access to the value and all derivatives of $$f$$ at a query point $$x$$. We show that for any (potentially randomized) algorithm $$\mathsf{A}$$, there exists a function $$f$$ with Lipschitz $$p$$th-order derivatives such that $$\mathsf{A}$$ requires at least $$\epsilon^{-(p+1)/p}$$ queries to find an $$\epsilon$$-stationary point. Our lower bounds are sharp to within constants, and they show that gradient descent, cubic-regularized Newton’s method, and generalized $$p$$th-order regularization are worst-case optimal within their natural function classes.
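As a brief worked instance of the rate above, with constants and the dependence on problem parameters suppressed as in the statement itself, the exponent $$(p+1)/p$$ specializes as follows:

$$p = 1:\quad \epsilon^{-(1+1)/1} = \epsilon^{-2}, \qquad p = 2:\quad \epsilon^{-(2+1)/2} = \epsilon^{-3/2}, \qquad p \rightarrow \infty:\quad \epsilon^{-(p+1)/p} \rightarrow \epsilon^{-1}.$$

The case $$p = 1$$ corresponds to the function class natural for gradient descent (Lipschitz gradient), and $$p = 2$$ to the class natural for cubic-regularized Newton’s method (Lipschitz Hessian).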