Abstract

In recent years, stochastic gradient descent (SGD) methods and randomized linear algebra (RLA) algorithms have been applied to many large-scale problems in machine learning and data analysis. SGD methods are easy to implement and applicable to a wide range of convex optimization problems. In contrast, RLA algorithms provide much stronger worst-case performance guarantees but are applicable to a narrower class of problems. We aim to bridge the gap between these two classes of methods in solving constrained overdetermined linear regression problems, e.g., ℓ2 and ℓ1 regression problems.

• We propose a hybrid algorithm named pwSGD that uses RLA techniques for preconditioning and constructing an importance sampling distribution, and then performs an SGD-like iterative process with weighted sampling on the preconditioned system.
• By rewriting the ℓp regression problem as a stochastic optimization problem, we connect pwSGD to several existing ℓp solvers, including RLA methods with algorithmic leveraging (RLA for short).
• We prove that pwSGD inherits faster convergence rates that depend only on the lower dimension of the linear system, while maintaining low computational complexity. This SGD convergence rate is superior to those of related SGD algorithms such as the weighted randomized Kaczmarz algorithm.
• In particular, when solving ℓ1 regression on an n-by-d problem, pwSGD returns an approximate solution with ε relative error on the objective value in O(log n · nnz(A) + poly(d)/ε²) time. This complexity is uniformly better than that of RLA methods in terms of both ε and d when the problem is unconstrained. In the presence of constraints, pwSGD only has to solve a sequence of much simpler and smaller optimization problems over the same constraints, which is in general more efficient than solving the constrained subproblem required in RLA.
• For ℓ2 regression, pwSGD returns an approximate solution with ε relative error on the objective value and on the solution vector in prediction norm in O(log n · nnz(A) + poly(d) log(1/ε)/ε) time. We show that when solving unconstrained ℓ2 regression, this complexity is comparable to that of RLA and is asymptotically better than several state-of-the-art solvers in the regime where the desired accuracy ε, high dimension n, and low dimension d satisfy d ≥ 1/ε and n ≥ d²/ε.

Finally, the effectiveness of these algorithms is illustrated numerically on both synthetic and real datasets. The results are consistent with our theoretical findings and demonstrate that pwSGD converges to a medium-precision solution, e.g., ε = 10^-3, more quickly than other methods.
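The pipeline summarized above (RLA preconditioning, an importance sampling distribution from the preconditioned rows, then SGD with weighted sampling) can be sketched for the unconstrained ℓ2 case as follows. This is a minimal illustration, not the paper's exact construction: the function name, Gaussian sketch size, and step-size schedule are heuristic assumptions.

```python
import numpy as np

def pwsgd_l2(A, b, n_iters=3000, step=0.1, seed=0):
    """Illustrative pwSGD-style solver for unconstrained least squares
    min_x ||Ax - b||_2, following the three steps in the abstract:
      1. RLA preconditioning: sketch A, take R from a QR factorization
         of the sketch, and work with the preconditioned system A R^{-1}.
      2. Importance sampling: row probabilities proportional to the
         squared row norms of A R^{-1}.
      3. SGD with weighted row sampling on the preconditioned system.
    Sketch size and step schedule are heuristic, not the paper's constants.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape

    # 1) Randomized preconditioning via a Gaussian sketch: QR of S A.
    s = 4 * d  # sketch size (heuristic)
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    _, R = np.linalg.qr(S @ A)
    AR = A @ np.linalg.inv(R)              # preconditioned matrix, n x d

    # 2) Importance sampling distribution from row norms of A R^{-1}.
    p = np.sum(AR ** 2, axis=1)
    p /= p.sum()

    # 3) Weighted SGD on f(y) = ||AR y - b||_2^2 using an unbiased,
    #    importance-weighted stochastic gradient.
    y = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.choice(n, p=p)
        g = 2.0 * (AR[i] @ y - b[i]) * AR[i] / p[i]
        y -= (step / np.sqrt(t)) * g
    return np.linalg.solve(R, y)           # recover x = R^{-1} y
```

Because the system is solved in the well-conditioned coordinates A R^{-1}, the SGD phase behaves as if the problem had condition number close to one, which is what allows the convergence rate to depend only on the low dimension d.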

