In this paper, we consider stochastic composite convex optimization problems with the objective function satisfying a stochastic bounded gradient condition, with or without a quadratic functional growth property. These models include the most well-known classes of objective functions analyzed in the literature: nonsmooth Lipschitz functions and composition of a (potentially) nonsmooth function and a smooth function, with or without strong convexity. Based on the flexibility offered by our optimization model, we consider several variants of stochastic first-order methods, such as the stochastic proximal gradient and the stochastic proximal point algorithms. Usually, the convergence theory for these methods has been derived for simple stochastic optimization models satisfying restrictive assumptions, and the rates are in general sublinear and hold only for specific decreasing stepsizes. Hence, we analyze the convergence rates of stochastic first-order methods with constant or variable stepsize under general assumptions covering a large class of objective functions. For constant stepsize, we show that these methods can achieve linear convergence rate up to a constant proportional to the stepsize and under some strong stochastic bounded gradient condition even pure linear convergence. Moreover, when a variable stepsize is chosen we derive sublinear convergence rates for these stochastic first-order methods. Finally, the stochastic gradient mapping and the Moreau smoothing mapping introduced in the present paper lead to simple and intuitive proofs.
Read full abstract