Lipschitz Continuous Gradient Research Articles

Most first-order methods rely on the global Lipschitz continuity of the objective gradient, which fails to hold in many problems. This paper develops a sequential local optimization (SLO) framework for first-order algorithms to optimize problems without a Lipschitz gradient. Operating on the assumption that the gradient is locally Lipschitz continuous over any compact set, SLO develops a careful scheme to control the distance between successive iterates. The proposed framework can readily be adapted to existing first-order methods, such as projected gradient descent (PGD), truncated gradient descent (TGD), and a parameter-free variant of Armijo linesearch. We show that SLO requires [Formula: see text] gradient evaluations to find an ϵ-stationary point, where Y is a certain compact set with [Formula: see text] radius, and [Formula: see text] denotes the Lipschitz constant of the i-th order derivatives in Y. It is worth noting that our analysis provides the first nonasymptotic convergence rate for the (slight variant of) Armijo linesearch algorithm without globally Lipschitz continuous gradient or convexity. As a generic framework, we also show that SLO can incorporate more complicated subroutines, such as a variant of the accelerated gradient descent (AGD) method that can harness the problem’s second-order smoothness without Hessian computation, which achieves an improved [Formula: see text] complexity.

Funding: J. Zhang is supported by the MOE AcRF [Grant A-0009530-04-00], from Singapore Ministry of Education. M. Hong is supported by NSF [Grants CIF-1910385 and EPCN-2311007].

Supplemental Material: The online appendix is available at https://doi.org/10.1287/ijoo.2021.0029 .
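To illustrate why a linesearch helps when the gradient is only locally Lipschitz, the following is a minimal Python sketch of textbook Armijo backtracking gradient descent on f(x) = x^4, whose gradient 4x^3 is Lipschitz on every compact interval but not globally. It is not the SLO framework or the parameter-free variant described in the abstract above; the function names, the test function, and all parameter values are assumptions chosen for illustration.

```python
def armijo_gradient_descent(f, grad, x0, step0=1.0, shrink=0.5, c=1e-4,
                            tol=1e-6, max_iter=10_000):
    """Scalar gradient descent with standard Armijo backtracking (illustrative)."""
    x = float(x0)
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) <= tol:  # approximate stationarity reached
            break
        step = step0
        # Shrink the step until the sufficient-decrease (Armijo) condition holds.
        while f(x - step * g) > f(x) - c * step * g * g:
            step *= shrink
        x -= step * g
    return x


def quartic(x):
    """Hypothetical test objective: gradient is not globally Lipschitz."""
    return x ** 4


def quartic_grad(x):
    return 4 * x ** 3


x_final = armijo_gradient_descent(quartic, quartic_grad, x0=3.0)
print(abs(quartic_grad(x_final)) <= 1e-6)
```

A fixed step size tuned for a small starting point would diverge from a large one here, because the local curvature 12x^2 grows without bound. Backtracking selects a step that matches the curvature near the current iterate, which is the intuition behind controlling the distance between successive iterates.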

Sparsity finds applications in diverse areas such as statistics, machine learning, and signal processing. Computations over sparse structures are less complex compared to their dense counterparts and need less storage. This paper proposes a heuristic method for retrieving sparse approximate solutions of optimization problems via minimizing the $\ell_p$ quasi-norm, where $0<p<1$. An iterative two-block algorithm for minimizing the $\ell_p$ quasi-norm subject to convex constraints is proposed. The proposed algorithm requires solving for the roots of a scalar degree polynomial as opposed to applying a soft thresholding operator in the case of $\ell_1$ norm minimization. The algorithm’s merit relies on its ability to solve the $\ell_p$ quasi-norm minimization subject to any convex constraints set. For the specific case of constraints defined by differentiable functions with Lipschitz continuous gradient, a second, faster algorithm is proposed. Using a proximal gradient step, we mitigate the convex projection step and hence enhance the algorithm’s speed while proving its convergence. We present various applications where the proposed algorithm excels, namely, sparse signal reconstruction, system identification, and matrix completion. The results demonstrate the significant gains obtained by the proposed algorithm compared to other $\ell_p$ quasi-norm based methods presented in previous literature.
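For reference, the soft thresholding operator that the abstract contrasts with the $\ell_p$ case can be sketched in a few lines of Python. This shows only the standard $\ell_1$ proximal step, not the paper's two-block algorithm or its polynomial root-finding step; the function name and sample values are assumptions made for illustration.

```python
def soft_threshold(values, threshold):
    """Entrywise proximal operator of threshold * ||x||_1 (illustrative)."""
    result = []
    for v in values:
        if abs(v) <= threshold:
            result.append(0.0)            # small entries are set exactly to zero
        elif v > 0:
            result.append(v - threshold)  # positive entries shrink toward zero
        else:
            result.append(v + threshold)  # negative entries shrink toward zero
    return result


# Hypothetical input vector; entries with magnitude below 1.0 vanish.
shrunk = soft_threshold([3.0, -0.5, 1.2, -2.0], threshold=1.0)
print(shrunk)
```

The operator has a closed form because the $\ell_1$ norm is convex and separable. For $0<p<1$ the penalty is nonconvex, which is why the abstract describes a root-finding step in its place.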

Articles published on Lipschitz Continuous Gradient

On generalized Jacobians in the sense of Clarke for the inverse of a bi‐Lipschitz map and applications in relaxation theory

SBL-LCGL: sparse Bayesian learning based on Laplace distribution for robust cone-beam x-ray luminescence computed tomography

Universal heavy-ball method for nonconvex optimization under Hölder continuous Hessians

Efficiency of Stochastic Coordinate Proximal Gradient Methods on Nonseparable Composite Optimization

Convergence Rate of the (1+1)-ES on Locally Strongly Convex and Lipschitz Smooth Functions

Incremental Quasi-Newton Methods with Faster Superlinear Convergence Rates

A Bregman Proximal Stochastic Gradient Method with Extrapolation for Nonconvex Nonsmooth Problems

Trust Region Methods for Nonconvex Stochastic Optimization beyond Lipschitz Smoothness

First-Order Algorithms Without Lipschitz Gradient: A Sequential Local Optimization Approach

Lp quasi-norm minimization: algorithm and applications

Lipschitz continuity of the metric projection operator and convergence of gradient methods

Communication Efficient Curvature Aided Primal-Dual Algorithms for Decentralized Optimization

Accelerated First-Order Methods for Convex Optimization with Locally Lipschitz Continuous Gradient

An accelerated exact distributed first-order algorithm for optimization over directed networks

A proximal subgradient algorithm with extrapolation for structured nonconvex nonsmooth problems

Time Rescaling of a Primal-Dual Dynamical System with Asymptotically Vanishing Damping

A Bregman stochastic method for nonconvex nonsmooth problem beyond global Lipschitz gradient continuity

The Proxy Step-Size Technique for Regularized Optimization on the Sphere Manifold

Global Convergence of Policy Gradient Primal–Dual Methods for Risk-Constrained LQRs

Stochastic Composition Optimization of Functions Without Lipschitz Continuous Gradient
