Spurious Local Minima Research Articles

Study of a simple single-trace transmission example shows how an extended source formulation of full-waveform inversion can produce an optimization problem without spurious local minima (‘cycle skipping’), hence efficiently solvable via Newton-like local optimization methods. The data consist of a single trace extracted from a causal pressure field, propagating in a homogeneous fluid according to linear acoustics, and recorded at a given distance from a transient point energy source. The source intensity (‘wavelet’) is presumed quasi-impulsive, with zero energy for time lags greater than a specified maximum lag. The inverse problem is: from the recorded trace, recover both the sound velocity or slowness and source wavelet with specified support, so that the data is fit with prescribed RMS relative error. The least-squares objective function has multiple large residual minimizers. The extended inverse problem permits source energy to spread in time, and replaces the maximum lag constraint by a weighted quadratic penalty. A companion paper shows that for proper choice of weight operator, any stationary point of the extended objective produces a good approximation of the global minimizer of the least squares objective, with slowness error bounded by a multiple of the maximum lag and the assumed noise level. This paper summarizes the theory developed in the companion paper and presents numerical experiments demonstrating the accuracy of the predictions in concrete instances. We also show how to dynamically adjust the penalty scale during iterative optimization to improve the accuracy of the slowness estimate.

Read full abstract

This paper considers general rank-constrained optimization problems that minimize a general objective function <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">${f}( {X})$ </tex-math></inline-formula> over the set of rectangular <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">${n}\times {m}$ </tex-math></inline-formula> matrices that have rank at most r. To tackle the rank constraint and also to reduce the computational burden, we factorize <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {X}$ </tex-math></inline-formula> into <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {U} {V} ^{\mathrm {T}}$ </tex-math></inline-formula> where <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {U}$ </tex-math></inline-formula> and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {V}$ </tex-math></inline-formula> are <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">${n}\times {r}$ </tex-math></inline-formula> and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">${m}\times {r}$ </tex-math></inline-formula> matrices, respectively, and then optimize over the small matrices <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {U}$ </tex-math></inline-formula> and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {V}$ </tex-math></inline-formula> . We characterize the global optimization geometry of the nonconvex factored problem and show that the corresponding objective function satisfies the robust strict saddle property as long as the original objective function f satisfies restricted strong convexity and smoothness properties, ensuring global convergence of many local search algorithms (such as noisy gradient descent) in polynomial time for solving the factored problem. We also provide a comprehensive analysis for the optimization geometry of a matrix factorization problem where we aim to find <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">${n}\times {r}$ </tex-math></inline-formula> and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">${m}\times {r}$ </tex-math></inline-formula> matrices <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {U}$ </tex-math></inline-formula> and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {V}$ </tex-math></inline-formula> such that <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {U} {V} ^{\mathrm {T}}$ </tex-math></inline-formula> approximates a given matrix <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$ {X}^\star $ </tex-math></inline-formula> . Aside from the robust strict saddle property, we show that the objective function of the matrix factorization problem has no spurious local minima and obeys the strict saddle property not only for the exact-parameterization case where <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathrm {rank}( {X}^\star) = {r}$ </tex-math></inline-formula> , but also for the over-parameterization case where <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathrm {rank}( {X}^\star) < {r}$ </tex-math></inline-formula> and the under-parameterization case where <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"> <tex-math notation="LaTeX">$\mathrm {rank}( {X}^\star) > {r}$ </tex-math></inline-formula> . These geometric properties imply that a number of iterative optimization algorithms (such as gradient descent) converge to a global solution with random initialization.

Read full abstract

Spurious Local Minima Research Articles

Related Topics

Articles published on Spurious Local Minima

Spurious Local Minima are Common for Deep Neural Networks With Piecewise Linear Activations.

Metalearning-Based Alternating Minimization Algorithm for Nonconvex Optimization.

Low-Rank Univariate Sum of Squares Has No Spurious Local Minima

Theory of overparametrization in quantum neural networks.

A Feasible Method for Solving an SDP Relaxation of the Quadratic Knapsack Problem

Time-Variation in Online Nonconvex Optimization Enables Escaping From Spurious Local Minima

Solution of an acoustic transmission inverse problem by extended inversion

Local and Global Convergence of General Burer-Monteiro Tensor Optimizations

Optimization on the Euclidean Unit Sphere

Homogeneous polynomials and spurious local minima on the unit sphere

On the geometric analysis of a quartic–quadratic optimization problem under a spherical constraint

An unbiased approach to compressed sensing

The Global Optimization Geometry of Low-Rank Matrix Optimization

The Global Geometry of Centralized and Distributed Low-rank Matrix Recovery Without Regularization

On the Global Geometry of Sphere-Constrained Sparse Blind Deconvolution.

Spurious Local Minima in Power System State Estimation

The Global Optimization Geometry of Shallow Linear Neural Networks

A machine-learning approach for noise reduction in parameter estimation inverse problems, applied to characterization of oil reservoirs

Global Optimality in Low-Rank Matrix Optimization

An exact penalty function based on the projection matrix

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Spurious Local Minima Research Articles

Related Topics

Articles published on Spurious Local Minima

Spurious Local Minima are Common for Deep Neural Networks With Piecewise Linear Activations.

Metalearning-Based Alternating Minimization Algorithm for Nonconvex Optimization.

Low-Rank Univariate Sum of Squares Has No Spurious Local Minima

Theory of overparametrization in quantum neural networks.

A Feasible Method for Solving an SDP Relaxation of the Quadratic Knapsack Problem

Time-Variation in Online Nonconvex Optimization Enables Escaping From Spurious Local Minima

Solution of an acoustic transmission inverse problem by extended inversion

Local and Global Convergence of General Burer-Monteiro Tensor Optimizations

Optimization on the Euclidean Unit Sphere

Homogeneous polynomials and spurious local minima on the unit sphere

On the geometric analysis of a quartic–quadratic optimization problem under a spherical constraint

An unbiased approach to compressed sensing

The Global Optimization Geometry of Low-Rank Matrix Optimization

The Global Geometry of Centralized and Distributed Low-rank Matrix Recovery Without Regularization

On the Global Geometry of Sphere-Constrained Sparse Blind Deconvolution.

Spurious Local Minima in Power System State Estimation

The Global Optimization Geometry of Shallow Linear Neural Networks

A machine-learning approach for noise reduction in parameter estimation inverse problems, applied to characterization of oil reservoirs

Global Optimality in Low-Rank Matrix Optimization

An exact penalty function based on the projection matrix