Abstract
We consider supervised learning problems over training sets in which both the number of training examples and the dimension of the feature vectors are large. We focus on the case where the loss function defining the quality of the parameter we wish to estimate may be non-convex, while the regularizer is convex. We propose a Doubly Stochastic Successive Convex approximation scheme (DSSC) that handles this class of non-convex regularized expected risk minimization problems. The method decomposes the decision variable into blocks and, at each step, operates on a random subset of blocks and a random subset of training examples; it is thus doubly stochastic, randomizing over both features (block coordinate updates) and samples (stochastic approximation). The algorithm belongs to the family of successive convex approximation methods: the original non-convex stochastic objective is replaced by a strongly convex sample surrogate function, and the resulting convex program is solved for each selected block in parallel. In contrast to many stochastic convex methods whose almost sure behavior is not guaranteed in non-convex settings, DSSC attains almost sure convergence to a stationary solution of the problem. Numerical experiments on a non-convex variant of lasso regression show that DSSC performs favorably in this setting.
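To make the abstract's description concrete, the sketch below illustrates one plausible instantiation of a doubly stochastic successive convex approximation loop for a problem of the form minimize E[f(x; data)] + λ‖x‖₁ with a smooth, possibly non-convex loss f. It is a minimal illustration under our own assumptions (linearized surrogate with a quadratic proximal term, diminishing step size, soft-thresholding for the ℓ₁ block subproblem); the names `grad_fn`, `dssc`, and the parameter choices are hypothetical and not the authors' reference implementation.

```python
import numpy as np


def soft_threshold(z, tau):
    """Proximal operator of tau*||.||_1: closed-form minimizer of the
    strongly convex per-block surrogate used below."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def dssc(grad_fn, x0, n_samples, n_blocks, lam=0.1, rho=1.0,
         n_iters=1000, batch_size=32, blocks_per_iter=2, seed=0):
    """Illustrative doubly stochastic successive convex approximation loop.

    grad_fn(x, idx) -> stochastic gradient of the smooth (possibly non-convex)
    loss evaluated on the training examples indexed by `idx`.
    The decision vector is split into `n_blocks` contiguous blocks; each
    iteration updates a random subset of blocks using a linearized surrogate
    with a quadratic proximal term (strong convexity constant `rho`).
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    blocks = np.array_split(np.arange(x.size), n_blocks)
    for t in range(1, n_iters + 1):
        gamma = 1.0 / (1.0 + 0.01 * t)                    # diminishing step size
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        g = grad_fn(x, idx)                               # stochastic gradient
        for b in rng.choice(n_blocks, size=blocks_per_iter, replace=False):
            cols = blocks[b]
            # Minimize the strongly convex block surrogate
            #   g_b^T (v - x_b) + (rho/2)||v - x_b||^2 + lam*||v||_1,
            # whose solution is a soft-thresholding step.
            v = soft_threshold(x[cols] - g[cols] / rho, lam / rho)
            # Averaged block update; unselected blocks stay unchanged.
            x[cols] = (1.0 - gamma) * x[cols] + gamma * v
    return x
```

In this sketch, the two sources of randomness (the mini-batch `idx` and the block subset `b`) correspond to the "doubly stochastic" structure described above, and each block subproblem is a small convex program that can be solved independently, hence in parallel.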