Stochastic Variance Reduced Gradient Methods Using a Trust-Region-Like Scheme

Tengteng Yu, Yu-Hong Dai, Xin-Wei Liu, Jie Sun

doi:10.1007/s10915-020-01402-x

Abstract

Stochastic variance reduced gradient (SVRG) methods are important approaches for minimizing the average of a large number of cost functions, a problem that arises frequently in machine learning and many other applications. In this paper, building on SVRG, we propose an SVRG-TR method, which employs a trust-region-like scheme for selecting stepsizes. We prove that the SVRG-TR method is linearly convergent in expectation for smooth strongly convex functions and enjoys a faster convergence rate than SVRG methods. To overcome the difficulty of tuning stepsizes by hand, we combine SVRG-TR with the Barzilai–Borwein (BB) method, which computes stepsizes automatically, and call the resulting method SVRG-TR-BB. By incorporating a mini-batching scheme into SVRG-TR and SVRG-TR-BB, respectively, we further propose two extended methods, mSVRG-TR and mSVRG-TR-BB. The linear convergence and complexity of mSVRG-TR are established. Numerical experiments on some standard datasets show that SVRG-TR and SVRG-TR-BB are generally better than or comparable to SVRG with best-tuned stepsizes and some modern stochastic gradient methods, while mSVRG-TR and mSVRG-TR-BB are very competitive with mini-batch variants of recent successful stochastic gradient methods.
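For orientation, the baseline that these methods build on can be sketched in a few lines. The following is a minimal illustration of plain SVRG on a small least-squares problem, with an optional Barzilai–Borwein-type stepsize recomputed once per outer loop. It is not the SVRG-TR or SVRG-TR-BB method: the abstract does not specify the trust-region-like scheme, so none is implemented here. The function name `svrg_least_squares`, the test problem, the inner-loop length `m = 2n`, and the scaling of the BB stepsize by `1/m` are all assumptions made for this sketch.

```python
import math
import random


def svrg_least_squares(A, b, eta=0.1, m=None, epochs=30, use_bb=False, seed=0):
    """Plain SVRG for f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)**2.

    Illustrative only: this is the SVRG baseline, not SVRG-TR.
    """
    n, d = len(A), len(A[0])
    m = m or 2 * n  # inner-loop length (assumed heuristic)
    rng = random.Random(seed)

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    def grad_i(x, i):
        # Gradient of the i-th component function at x.
        r = dot(A[i], x) - b[i]
        return [r * aij for aij in A[i]]

    def full_grad(x):
        # Average of all component gradients at x.
        g = [0.0] * d
        for i in range(n):
            g = [gj + gij / n for gj, gij in zip(g, grad_i(x, i))]
        return g

    x_tilde = [0.0] * d  # snapshot point
    prev_x, prev_mu = None, None
    for _ in range(epochs):
        mu = full_grad(x_tilde)
        if use_bb and prev_x is not None:
            # BB-type stepsize from successive snapshots and full gradients,
            # divided by m (assumed scaling); keep the old eta if s.y <= 0.
            s = [p - q for p, q in zip(x_tilde, prev_x)]
            y = [p - q for p, q in zip(mu, prev_mu)]
            sy = dot(s, y)
            if sy > 0.0:
                eta = dot(s, s) / (m * sy)
        prev_x, prev_mu = x_tilde, mu

        x = list(x_tilde)
        for _ in range(m):
            i = rng.randrange(n)
            g_cur, g_snap = grad_i(x, i), grad_i(x_tilde, i)
            # Variance-reduced gradient: grad_i(x) - grad_i(x_tilde) + mu.
            x = [xj - eta * (gc - gs + muj)
                 for xj, gc, gs, muj in zip(x, g_cur, g_snap, mu)]
        x_tilde = x
    return x_tilde


# Small synthetic problem whose exact minimizer is x_true.
A = [[math.cos(i), math.sin(i)] for i in range(20)]
x_true = [1.0, -2.0]
b = [row[0] * x_true[0] + row[1] * x_true[1] for row in A]

x_fixed = svrg_least_squares(A, b, eta=0.1)
x_bb = svrg_least_squares(A, b, use_bb=True)
```

On this well-conditioned toy problem both variants should recover `x_true` closely; the BB variant only needs an initial `eta` for the first outer loop, which is the hand-tuning burden the abstract refers to.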

Full Text