Abstract
Data preconditioning, which reduces the condition number of a problem by applying a linear transformation to the data matrix, is commonly used to accelerate the convergence of first-order optimization methods for regularized loss minimization. An obvious limitation of the technique is its prohibitive computational cost on large-scale problems, especially those with a very large number of samples. In this paper, we propose a gradient preconditioning trick and combine it with mini-batch SGD. The resulting gradient-preconditioned mini-batch SGD algorithm indeed accelerates convergence, and does so at a lower computational cost than data preconditioning for ridge regression. Concretely, we use recent random projection and linear sketching methods to compute a randomized low-rank approximation of the data matrix, from which we construct a suitable preconditioner via standard numerical linear algebra. Finally, we apply the resulting preconditioner to the gradient to reduce the computational cost. Experimental results on both synthetic and real data sets validate the feasibility and effectiveness of our trick and algorithm.
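To make the pipeline concrete, the following is a minimal sketch (not the authors' exact algorithm) of gradient-preconditioned mini-batch SGD for the ridge objective (1/2n)||Ax - b||^2 + (lam/2)||x||^2. It assumes a Halko-style randomized range finder for the low-rank step and an approximate inverse-Hessian preconditioner of the form V diag(1/(s^2/n + lam)) V^T + (I - V V^T)/lam; the rank k, step size, and batch size are illustrative choices.

```python
import numpy as np

def randomized_lowrank(A, k, oversample=10, rng=None):
    """Randomized rank-k approximation of A via a Gaussian sketch (sketch-and-SVD)."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    Omega = rng.standard_normal((d, k + oversample))
    Y = A @ Omega                      # n x (k+p) sketch of the range of A
    Q, _ = np.linalg.qr(Y)             # orthonormal basis for the sketch
    B = Q.T @ A                        # small (k+p) x d matrix
    _, s, Vt = np.linalg.svd(B, full_matrices=False)
    return s[:k], Vt[:k]               # top-k singular values / right singular vectors

def make_preconditioner(A, lam, k):
    """Approximate inverse of the ridge Hessian H = A^T A / n + lam * I
    using the rank-k sketch: P = V diag(1/(s^2/n + lam)) V^T + (I - V V^T) / lam."""
    n = A.shape[0]
    s, Vt = randomized_lowrank(A, k)
    d_inv = 1.0 / (s ** 2 / n + lam)

    def apply_P(g):
        Vg = Vt @ g
        return Vt.T @ (d_inv * Vg) + (g - Vt.T @ Vg) / lam

    return apply_P

def precond_minibatch_sgd(A, b, lam, k, lr=1.0, batch=64, epochs=20, rng=None):
    """Mini-batch SGD for ridge regression with the sketched preconditioner
    applied to the gradient (the data matrix itself is never transformed)."""
    rng = np.random.default_rng(rng)
    n, d = A.shape
    x = np.zeros(d)
    apply_P = make_preconditioner(A, lam, k)   # built once, before the SGD loop
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch):
            j = idx[start:start + batch]
            # stochastic ridge gradient on the mini-batch
            g = A[j].T @ (A[j] @ x - b[j]) / len(j) + lam * x
            x -= lr * apply_P(g)               # precondition the gradient, not the data
    return x
```

The key cost difference the abstract points to is visible here: the sketch and preconditioner are computed once at O(ndk) cost, and each SGD step only pays an extra O(dk) to apply the preconditioner to the gradient, rather than transforming the full n x d data matrix as in data preconditioning.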