Abstract

Stochastic optimization (SO) algorithms based on the Powerball function, known as powered stochastic optimization (PoweredSO) algorithms, have been shown to be effective and have demonstrated great potential in large-scale optimization and machine learning tasks. Nevertheless, how to determine the learning rate for PoweredSO remains a challenging and unsolved problem. In this paper, we propose a class of adaptive PoweredSO approaches that are efficient, scalable and robust. They take advantage of the hypergradient descent (HD) technique to automatically acquire an online learning rate for PoweredSO-type methods. In the first part, we study the behavior of the canonical PoweredSO algorithm, the Powerball stochastic gradient descent (pbSGD) method, combined with HD. Existing PoweredSO algorithms also suffer from high variance arising from their sampling strategies, since they share a similar algorithmic framework with SO algorithms. Therefore, the second part develops an adaptive powered variance-reduced optimization method that combines a variance-reduction technique with HD. Moreover, we present a convergence analysis of the proposed algorithms and establish their iteration complexity in the non-convex setting. Numerical experiments on machine learning tasks verify their superior performance over modern SO algorithms.
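To make the setting concrete, below is a minimal Python sketch of a pbSGD-style update with a hypergradient-adapted learning rate. It is not the exact formulation analyzed in the paper: it assumes the standard element-wise Powerball transform sign(g)|g|^gamma and the hypergradient learning-rate rule of Baydin et al., and the names grad_fn, alpha, beta, gamma and n_steps are illustrative placeholders.

import numpy as np

def powerball(g, gamma):
    # Element-wise Powerball transform: sign(g) * |g|^gamma.
    return np.sign(g) * np.abs(g) ** gamma

def pbsgd_hd(grad_fn, w, alpha=0.01, beta=1e-4, gamma=0.6, n_steps=100):
    # pbSGD with a hypergradient-style online learning rate (sketch).
    # grad_fn(w) returns a stochastic gradient of the objective at w.
    prev_direction = np.zeros_like(w)
    for _ in range(n_steps):
        g = grad_fn(w)
        # Hypergradient step: adapt alpha using the inner product of the
        # current gradient and the previous descent direction.
        alpha = alpha + beta * np.dot(g, prev_direction)
        direction = powerball(g, gamma)
        w = w - alpha * direction
        prev_direction = direction
    return w

# Usage: minimize f(w) = 0.5 * ||w||^2 with a noisy gradient oracle.
rng = np.random.default_rng(0)
noisy_grad = lambda w: w + 0.01 * rng.standard_normal(w.shape)
w_final = pbsgd_hd(noisy_grad, w=np.ones(10))

The hypergradient rule increases alpha when consecutive descent directions are positively correlated and decreases it when they point in opposing directions, which is what allows the learning rate to be acquired online rather than tuned by hand.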
