Abstract
This work considers the minimization of the sum of an expectation-valued, coordinate-wise smooth nonconvex function and a nonsmooth block-separable convex regularizer. We propose an asynchronous variance-reduced algorithm in which, at each iteration, a single randomly chosen block updates its estimate via a proximal variable sample-size stochastic gradient scheme while the remaining blocks are kept fixed. Notably, each block employs a steplength that relies on its block-specific Lipschitz constant, while batch sizes are updated as a function of the number of times that block has been selected. We show that every limit point is a stationary point and establish an ergodic non-asymptotic rate, together with the iteration and oracle complexity of obtaining an ε-stationary point. Furthermore, under a proximal Polyak–Łojasiewicz condition with batch sizes increasing at a geometric rate, we prove that the suboptimality diminishes at a geometric rate, which is the optimal deterministic rate, and we derive the iteration and oracle complexity of obtaining an ε-optimal solution. In the single-block setting, we obtain the optimal oracle complexity. Finally, preliminary numerics suggest that the schemes compare well with competitors reliant on global Lipschitz constants.
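To make the update concrete, the following is a minimal serial sketch of a randomized block proximal stochastic gradient step with block-specific steplengths and batch sizes that grow with the number of times a block is selected. It is an illustration under stated assumptions, not the scheme analyzed in this work: the smooth term is a toy least-squares loss (convex, chosen only for simplicity), the regularizer is an l1 norm, the steplength is taken as 1/L_i, and the batch size is set equal to the selection count. The name `block_prox_sgd` and all parameter values are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1 (the assumed block-separable regularizer).
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def block_prox_sgd(A, b, blocks, lam=0.1, iters=2000, seed=0):
    # Toy objective (assumption): (1 / (2m)) * ||A x - b||^2 + lam * ||x||_1,
    # with stochastic gradients obtained by sampling rows of A with replacement.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    # Block-specific Lipschitz constants of the smooth part.
    L = [np.linalg.norm(A[:, blk], 2) ** 2 / m for blk in blocks]
    counts = [0] * len(blocks)
    for _ in range(iters):
        i = rng.integers(len(blocks))      # pick a single block uniformly at random
        blk = blocks[i]
        counts[i] += 1
        batch = counts[i]                  # assumed rule: batch size = selection count
        rows = rng.integers(m, size=batch)
        resid = A[rows] @ x - b[rows]
        g = A[rows][:, blk].T @ resid / batch   # mini-batch gradient for block i only
        step = 1.0 / L[i]                  # assumed block-specific steplength
        x[blk] = soft_threshold(x[blk] - step * g, step * lam)
    return x
```

Only the selected block changes in each iteration; the remaining coordinates of `x` are left untouched, and no global Lipschitz constant is used.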