Abstract
In this paper, we consider supervised learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. We focus on the case where the loss function defining the quality of the parameter we wish to estimate may be non-convex and is augmented by a convex regularizer. We propose a Doubly Stochastic Successive Convex approximation scheme (DSSC) able to handle non-convex regularized expected risk minimization. The method decomposes the decision variable into blocks and operates on a random subset of blocks at each step, fusing the merits of stochastic approximation with block coordinate methods, and then applies successive convex approximation on the selected blocks. In contrast to many stochastic convex methods whose almost sure behavior is not guaranteed in non-convex settings, DSSC attains almost sure convergence to a stationary solution of the problem. Moreover, we show that the proposed DSSC algorithm achieves stationarity at a rate of O((log t)/t^{1/4}). Numerical experiments on a non-convex variant of a lasso regression problem show that DSSC performs favorably in this setting. We then apply this method to the task of dictionary learning from high-dimensional visual data collected by a ground robot, and observe reliable convergence behavior for a difficult non-convex stochastic program.
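To make the mechanics concrete, below is a minimal sketch of a doubly stochastic successive convex approximation loop, written for a toy saturated-loss (non-convex) variant of lasso regression with an l1 regularizer. The block partition, quadratic surrogate, step-size schedules, and all names (dssc_sketch, soft_threshold, sample_grad, rho, lam) are illustrative assumptions, not the exact construction or parameter choices analyzed in the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    # Proximal operator of tau * ||.||_1; handles the convex l1 regularizer exactly.
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def dssc_sketch(sample_grad, dim, n_blocks=10, lam=0.1, rho=0.5, T=5000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    d = np.zeros(dim)                      # running average of stochastic gradients (surrogate slope)
    blocks = np.array_split(np.arange(dim), n_blocks)
    for t in range(1, T + 1):
        gamma = 1.0 / t ** 0.75            # averaging weight for the gradient estimate
        eta = 1.0 / np.sqrt(t)             # step size toward the block minimizer
        d = (1 - gamma) * d + gamma * sample_grad(x, rng)   # first source of randomness: data sample
        b = blocks[rng.integers(n_blocks)]                  # second source: random block of coordinates
        # Minimize the strongly convex surrogate
        #   d[b].(y - x[b]) + (rho/2)||y - x[b]||^2 + lam*||y||_1
        # over the selected block; it has a closed-form soft-thresholding solution.
        x_hat = soft_threshold(x[b] - d[b] / rho, lam / rho)
        x[b] = (1 - eta) * x[b] + eta * x_hat               # convex combination update on the block
    return x

if __name__ == "__main__":
    # Toy non-convex lasso variant: saturated squared loss l(r) = r^2 / (1 + r^2) plus l1 penalty.
    rng = np.random.default_rng(1)
    dim = 50
    A = rng.standard_normal((200, dim))
    x_true = np.zeros(dim)
    x_true[:5] = 1.0
    y = A @ x_true + 0.1 * rng.standard_normal(200)

    def sample_grad(x, rng):
        i = rng.integers(A.shape[0])       # draw one training example at random
        r = A[i] @ x - y[i]
        return (2.0 * r / (1.0 + r ** 2) ** 2) * A[i]   # gradient of the saturated loss

    x_est = dssc_sketch(sample_grad, dim)
    print("recovered support:", np.flatnonzero(np.abs(x_est) > 0.1))
```

The averaging weight gamma and step size eta above are heuristic placeholders; the paper's analysis prescribes specific diminishing schedules under which the almost sure convergence and the O((log t)/t^{1/4}) rate are established.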