Stochastic Composition Optimization of Functions Without Lipschitz Continuous Gradient

Yin Liu, Sam Davanloo Tajbakhsh

doi:10.1007/s10957-023-02180-w

Abstract

In this paper, we study stochastic optimization of two-level compositions of functions without Lipschitz continuous gradients. The smoothness property is generalized by the notion of relative smoothness, which motivates the Bregman gradient method. We propose three stochastic composition Bregman gradient algorithms for the three possible relatively smooth compositional scenarios and provide their sample complexities for reaching an $$\epsilon$$-approximate stationary point. For the smooth-of-relatively-smooth composition, the first algorithm requires $$\mathcal{O}(\epsilon^{-2})$$ calls to the stochastic oracles of the inner function value and gradient as well as the outer function gradient. When both functions are relatively smooth, the second algorithm requires $$\mathcal{O}(\epsilon^{-3})$$ calls to the stochastic oracle of the inner function value and $$\mathcal{O}(\epsilon^{-2})$$ calls to the stochastic oracles of the inner and outer function gradients. We further improve the second algorithm by variance reduction for the setting where only the inner function is smooth. The resulting algorithm requires $$\mathcal{O}(\epsilon^{-5/2})$$ calls to the stochastic oracle of the inner function value, $$\mathcal{O}(\epsilon^{-3/2})$$ calls to that of the inner function gradient, and $$\mathcal{O}(\epsilon^{-2})$$ calls to that of the outer function gradient. Finally, we numerically evaluate the performance of these three algorithms on two different examples.
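To make the idea of a stochastic composition Bregman gradient step concrete, here is a minimal sketch in Python. It is not one of the three algorithms of the paper. It is a generic illustration under several assumptions of ours: the kernel $$h(x) = \tfrac{1}{2}\Vert x\Vert ^2 + \tfrac{1}{4}\Vert x\Vert ^4$$, a toy inner map $$g(x) = x \odot x$$, a toy outer function $$f(u) = \tfrac{1}{2}\Vert u - b\Vert ^2$$, Gaussian oracle noise, and a simple moving average to track the inner function value. All function and parameter names are hypothetical. The composite objective is quartic, so its gradient is not globally Lipschitz, while the Bregman step $$\nabla h(x^+) = \nabla h(x) - \tau v$$ still has a closed form up to a scalar root-finding problem.

```python
import numpy as np

def objective(x, b):
    # Toy composite objective f(g(x)) with g(x) = x*x and f(u) = 0.5*||u - b||^2.
    return 0.5 * np.sum((x * x - b) ** 2)

def grad_h(x):
    # Gradient of the assumed kernel h(x) = 0.5*||x||^2 + 0.25*||x||^4.
    return (1.0 + x @ x) * x

def grad_h_inverse(p, iters=80):
    # Solve (1 + ||y||^2) y = p. Writing y = t*p gives ||p||^2 t^3 + t - 1 = 0,
    # which has exactly one root t in (0, 1]; locate it by bisection.
    c = p @ p
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if c * mid**3 + mid - 1.0 > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi) * p

def bregman_step(x, grad_est, step):
    # x_plus = argmin_y <grad_est, y> + D_h(y, x) / step,
    # equivalently grad_h(x_plus) = grad_h(x) - step * grad_est.
    return grad_h_inverse(grad_h(x) - step * grad_est)

def run(x0, b, steps=500, step=0.05, beta=0.5, noise=0.1, seed=0):
    # Hypothetical driver: noisy oracles plus a moving-average inner-value tracker.
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = x.size
    u = x * x  # initial estimate of the inner function value g(x)
    for _ in range(steps):
        g_val = x * x + noise * rng.standard_normal(n)             # inner value oracle
        g_jac = np.diag(2.0 * x) + noise * rng.standard_normal((n, n))  # inner gradient oracle
        u = (1.0 - beta) * u + beta * g_val                        # track g(x)
        f_grad = (u - b) + noise * rng.standard_normal(n)          # outer gradient oracle at u
        x = bregman_step(x, g_jac.T @ f_grad, step)                # chain rule, then Bregman step
    return x

if __name__ == "__main__":
    b = np.array([1.0, 4.0])
    x0 = np.array([2.0, 0.5])
    x = run(x0, b)
    print("initial objective:", objective(x0, b))
    print("final objective:  ", objective(x, b))
```

Setting `noise=0.0` and `beta=1.0` reduces the loop to the deterministic Bregman gradient method for this toy problem. The step size, averaging weight, and noise level are illustrative values only and carry no complexity guarantee.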

More From: Journal of Optimization Theory and Applications


