Abstract

Stochastic approximation (SA) algorithms have been widely applied to minimization problems in which the loss function and/or the gradient are only accessible through noisy evaluations. Stochastic gradient (SG) descent, a first-order algorithm and a workhorse of much machine learning, is perhaps the most famous form of SA. Among SA algorithms, the second-order simultaneous perturbation stochastic approximation (2SPSA) and the second-order stochastic gradient (2SG) methods are particularly efficient for high-dimensional problems, covering both the gradient-free and gradient-based settings. However, due to the necessary matrix operations, the per-iteration floating-point operations (FLOPs) cost of the standard 2SPSA/2SG is O(p^3), where p is the dimension of the underlying parameter. Note that this O(p^3) FLOPs cost is distinct from the classical SPSA-based per-iteration O(1) cost measured in the number of noisy function evaluations. In this work, we propose a technique to implement the 2SPSA/2SG algorithms efficiently via a symmetric indefinite matrix factorization and show that the per-iteration FLOPs cost is reduced from O(p^3) to O(p^2). The formal almost sure convergence and rate of convergence of the newly proposed approach are directly inherited from the standard 2SPSA/2SG. The improvements in efficiency and numerical stability are demonstrated in two numerical studies.
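To make the O(p^3) bottleneck concrete, the following is a minimal, illustrative NumPy sketch of one iteration of the standard gradient-based 2SG update. The helper noisy_grad and the gain values a_k and c_k are assumptions introduced here for illustration; this sketch shows the standard implementation only, not the paper's O(p^2) factorization-based method.

```python
# Minimal illustrative sketch of one standard 2SG iteration (NumPy).
# This is NOT the paper's O(p^2) method; it only shows where the O(p^3)
# per-iteration cost of the standard implementation comes from.
import numpy as np

def two_sg_step(theta, H_bar, k, noisy_grad, a_k, c_k):
    """One standard second-order stochastic gradient (2SG) iteration.

    noisy_grad(theta): returns a noisy measurement of the loss gradient.
    H_bar: running average of the per-iteration Hessian estimates.
    """
    p = theta.size
    # Simultaneous perturbation direction (symmetric Bernoulli +/-1).
    delta = np.random.choice([-1.0, 1.0], size=p)

    # Rank-two Hessian estimate from two extra gradient measurements.
    dG = noisy_grad(theta + c_k * delta) - noisy_grad(theta - c_k * delta)
    H_hat = np.outer(dG / (2.0 * c_k), 1.0 / delta)
    H_hat = 0.5 * (H_hat + H_hat.T)  # symmetrize

    # Recursive averaging of the Hessian estimates.
    H_bar = (k / (k + 1.0)) * H_bar + (1.0 / (k + 1.0)) * H_hat

    # Standard implementation: force positive definiteness (here via an
    # eigenvalue modification) and solve a dense linear system.
    # Both steps cost O(p^3) FLOPs per iteration.
    w, V = np.linalg.eigh(H_bar)
    H_pd = (V * np.maximum(np.abs(w), 1e-8)) @ V.T
    theta = theta - a_k * np.linalg.solve(H_pd, noisy_grad(theta))
    return theta, H_bar
```

Roughly speaking, the proposed approach instead maintains a symmetric indefinite factorization of the averaged Hessian estimate and works directly with the factors, which is what brings the per-iteration cost down from O(p^3) to O(p^2).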
