Abstract

With the growth of data sizes and the development of multi-core computers, asynchronous parallel stochastic optimization algorithms such as KroMagnon have gained significant attention. In this paper, we propose a new Sparse approximation and asynchronous parallel Stochastic Variance Reduced Gradient (SSVRG) method for sparse, high-dimensional machine learning problems. Unlike standard SVRG and its asynchronous parallel variant, KroMagnon, SSVRG sets the snapshot point to the average of all iterates in the previous epoch, which allows it to use much larger learning rates and makes it more robust to the choice of learning rate. In particular, we use a sparse approximation of the popular SVRG estimator to perform completely sparse updates at all iterations. As a result, SSVRG has a much lower per-iteration computational cost than its dense counterpart, SVRG++, and is well suited to asynchronous parallel implementation. Moreover, we provide convergence guarantees for SSVRG for both strongly convex and non-strongly convex problems, whereas existing asynchronous algorithms (e.g., KroMagnon and ASAGA) have convergence guarantees only for strongly convex problems. Finally, we extend SSVRG to non-smooth and asynchronous parallel settings. Numerical experimental results demonstrate that SSVRG converges significantly faster than state-of-the-art asynchronous parallel methods such as KroMagnon, and is usually more than three orders of magnitude faster than SVRG++.
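To make the variance-reduced update and the epoch-average snapshot concrete, the following is a minimal, sequential sketch of an SVRG-style inner loop in which the snapshot point for the next epoch is the average of the current epoch's iterates. It uses an illustrative least-squares objective and hypothetical function and parameter names; it is not the paper's implementation, and the sparse approximation and asynchronous updates described in the abstract are intentionally omitted.

```python
import numpy as np


def svrg_epoch_avg_snapshot(A, b, x_start, snapshot, lr=0.05, epoch_len=200, rng=None):
    """One epoch of an SVRG-style method for f(x) = 1/(2n) ||Ax - b||^2.

    `snapshot` is the reference point at which the full gradient is computed
    (here, the average of the previous epoch's iterates).  Returns the last
    iterate and the new epoch average, which serves as the next snapshot.
    Illustrative sketch only.
    """
    rng = rng or np.random.default_rng(0)
    n = A.shape[0]
    # Full gradient at the snapshot point, computed once per epoch.
    full_grad = A.T @ (A @ snapshot - b) / n
    x = x_start.copy()
    running_sum = np.zeros_like(x)
    for _ in range(epoch_len):
        i = rng.integers(n)
        a_i = A[i]
        # Variance-reduced stochastic gradient:
        #   grad_i(x) - grad_i(snapshot) + full_grad
        g = a_i * (a_i @ x - b[i]) - a_i * (a_i @ snapshot - b[i]) + full_grad
        x -= lr * g
        running_sum += x
    return x, running_sum / epoch_len


# Toy usage on random data: the next snapshot is the average of the epoch's iterates.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))
b = A @ rng.standard_normal(10)
x = np.zeros(10)
snapshot = x.copy()
for _ in range(5):
    x, snapshot = svrg_epoch_avg_snapshot(A, b, x, snapshot)
```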
