Abstract

Stochastic gradient descent (SGD) is popular for large-scale optimization but suffers from slow convergence. To remedy this problem, stochastic variance reduced gradient (SVRG) was proposed, which uses an average gradient to reduce the effect of variance. Because computing the average gradient is expensive, it is updated only once every m iterations, where m is set to the same order as the data size. For large-scale problems this reduces efficiency, because the estimate of the average gradient may not be accurate enough. We propose a method that uses a mini-batch of samples to estimate the average gradient, called stochastic mini-batch variance reduced gradient (SMVRG). SMVRG greatly reduces the computational cost of estimating the average gradient, so the estimate can be refreshed more frequently and is therefore more accurate. Numerical experiments show the effectiveness of our method in terms of convergence rate and computational cost.
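To illustrate the idea described in the abstract, the following is a minimal sketch of an SVRG-style loop in which the snapshot average gradient is estimated from a random mini-batch rather than the full data set, so it can be refreshed more often. The function and parameter names (`grad_i`, `batch_size`, `inner_iters`, step size, etc.) are illustrative assumptions, not the paper's actual implementation or hyperparameters.

```python
import numpy as np

def smvrg_sketch(grad_i, w0, n, lr=0.01, outer_iters=50,
                 inner_iters=100, batch_size=64, rng=None):
    """Sketch of variance-reduced SGD with a mini-batch snapshot gradient.

    grad_i(w, i) is assumed to return the gradient of the i-th sample's
    loss at w; n is the number of samples. Instead of a full pass over
    the data, the snapshot ("average") gradient is estimated on a random
    mini-batch, so it is cheap enough to recompute frequently.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = w0.copy()
    for _ in range(outer_iters):
        # Take a snapshot point and estimate the average gradient
        # from a mini-batch (the key difference from plain SVRG).
        w_snap = w.copy()
        batch = rng.choice(n, size=batch_size, replace=False)
        mu = np.mean([grad_i(w_snap, i) for i in batch], axis=0)

        for _ in range(inner_iters):
            i = rng.integers(n)
            # Variance-reduced update: stochastic gradient corrected by
            # its value at the snapshot point plus the estimated average.
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= lr * g
    return w
```

Because only `batch_size` per-sample gradients are needed per snapshot, the outer refresh can happen far more often than in standard SVRG, which is the trade-off the abstract argues improves accuracy of the average-gradient estimate on large data sets.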
