Abstract

In SGD, estimating the gradient from a single randomly drawn sample introduces large variance, which slows convergence and makes training unstable. To address this, we propose a noise-reduction variant of the Stochastic Variance Reduced Gradient method (SVRG), called N-SVRG, which uses a small batch of samples instead of the full dataset to compute the average gradient, and updates that average gradient incrementally. In each outer round, a small batch of samples is drawn at random to estimate the average gradient, and during the inner iterations this estimate is refreshed using the gradients of the models visited so far. By suitably reducing the batch size B, both the memory footprint and the number of iterations can be reduced. In experiments against the state-of-the-art Mini-Batch SGD, AdaGrad, RMSProp, SVRG and SCSG, N-SVRG outperforms SVRG and SASG and is on par with SCSG. Finally, by studying how small values of the parameters n, B and k affect the effectiveness of the algorithm, we show that N-SVRG is stable and can reach sufficient accuracy even with a small batch size. The advantages and disadvantages of the various methods are compared experimentally, and the stability of N-SVRG is examined under different parameter settings.
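To make the mechanism concrete, the sketch below follows the idea stated above: an SVRG-style inner loop whose anchor gradient is estimated from a small batch of size B and refreshed incrementally during the inner iterations. It is only an illustrative sketch under our own assumptions (the refresh weighting, the helper grad_i, and all hyperparameters are hypothetical), not the exact update analyzed in the paper.

    import numpy as np

    def n_svrg_sketch(grad_i, w0, n, B=32, K=50, m=100, eta=0.05, seed=0):
        """Illustrative N-SVRG-style loop (not the paper's exact update rule).

        grad_i(w, i) -- gradient of the i-th sample's loss at w
        n            -- number of training samples
        B            -- small batch used to estimate the average gradient
        K, m         -- outer rounds and inner iterations per round
        """
        rng = np.random.default_rng(seed)
        w = w0.copy()
        for _ in range(K):
            w_snap = w.copy()                                  # snapshot (anchor) point
            batch = rng.choice(n, size=B, replace=False)
            # batch estimate of the average gradient instead of a full pass over the data
            mu = np.mean([grad_i(w_snap, i) for i in batch], axis=0)
            for _ in range(m):
                i = rng.integers(n)
                v = grad_i(w, i) - grad_i(w_snap, i) + mu      # SVRG-style variance-reduced direction
                w -= eta * v
                # incremental refresh of the average-gradient estimate with the newest
                # sample gradient; the (1 - 1/B, 1/B) weighting is an illustrative assumption
                mu = (1.0 - 1.0 / B) * mu + (1.0 / B) * grad_i(w, i)
        return w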

Highlights

  • The variance introduced by the stochastic sampling in SGD has become the main obstacle faced by today's optimization algorithms

  • To address the above challenge, we propose a noise-reduction method for the stochastic gradient approach, N-SVRG (N-Stochastic Variance Reduced Gradient), which replaces the global average gradient with a small-sample average gradient, training on small batches while updating the average gradient to achieve variance reduction. We present the algorithm and its convergence analysis in detail, compare it with the mainstream Mini-Batch SGD, AdaGrad, RMSProp, SVRG and SCSG algorithms, and show that N-SVRG outperforms SVRG, SASG and other algorithms and is on par with SCSG

  • By exploring the relationship between small values of the parameters n, B and k and the effectiveness of the algorithm, we show that N-SVRG is stable and can achieve sufficient accuracy even with a small batch size


Introduction

The variance introduced by the stochastic nature of SGD has become the main problem faced by today's optimization algorithms. Because of this variance, SGD attains only a sublinear convergence rate with a fixed step size [1], whereas the accuracy of a stochastic algorithm is positively related to its sampling variance: as the variance tends to 0, the bias of the algorithm also tends to 0, and in that case SGD can converge quickly even with a large step size. This motivated the improved Mini-Batch SGD (MBGD) algorithm, which in each iteration randomly selects m data samples from the original data, computes the gradient on this batch, and performs the weight update. Since each sample corresponds to a loss function, the empirical risk is the average of the n sample loss functions.
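As a concrete illustration of the update just described, the sketch below performs one MBGD step, assuming per-sample gradients are available through a user-supplied function; the names grad_i and eta and the sampling scheme are illustrative assumptions rather than the paper's code.

    import numpy as np

    def mbgd_step(grad_i, w, n, m, eta, rng):
        """One Mini-Batch SGD step (illustrative sketch).

        grad_i(w, i) -- gradient of the i-th sample's loss at w
        n            -- total number of samples
        m            -- mini-batch size
        eta          -- step size
        """
        batch = rng.choice(n, size=m, replace=False)          # randomly select m samples
        g = np.mean([grad_i(w, i) for i in batch], axis=0)    # mini-batch estimate of the empirical-risk gradient
        return w - eta * g                                    # gradient-descent weight update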
