Regularized black-box optimization algorithms for least-squares problems

Abstract

We consider the problem of optimizing the sum of a smooth, nonconvex function for which derivatives are unavailable and a convex, nonsmooth function with an easy-to-evaluate proximal operator. Of particular focus is the case where the smooth part has a nonlinear least-squares (LS) structure. We adapt two existing approaches for derivative-free optimization (DFO) of nonsmooth compositions of smooth functions to this setting. Our main contribution is adapting our algorithm to handle inexactly computed stationarity measures, where the inexactness is adaptively adjusted as required by the algorithm (previous approaches assumed access to exact stationarity measures, which is not realistic in this setting). Numerically, we provide two extensions of the state-of-the-art DFO-LS solver for nonlinear least-squares problems and demonstrate their strong practical performance.
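
The problem class described above, a smooth least-squares term whose derivatives are unavailable plus an l1 regularizer handled through its proximal operator, can be illustrated with a toy proximal-gradient loop in which the gradient of the smooth part is estimated by forward differences. This is only a minimal sketch of the problem setting, not the paper's algorithm; the function names, the fixed step size, and the test problem are all illustrative assumptions.

```python
# Minimal sketch: minimize 0.5*||r(x)||^2 + lam*||x||_1 where only function
# values of r are available. The smooth part's gradient is estimated by
# forward differences; the l1 term is handled by its proximal operator
# (soft-thresholding). Not the paper's method -- an illustration only.

def prox_l1(x, t):
    """Proximal operator of t*||.||_1 (componentwise soft-thresholding)."""
    return [max(abs(v) - t, 0.0) * (1.0 if v > 0 else -1.0) for v in x]

def fd_grad(f, x, h=1e-6):
    """Forward-difference estimate of the gradient of a scalar function f."""
    fx = f(x)
    g = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += h
        g.append((f(xp) - fx) / h)
    return g

def dfo_prox_gradient(residual, x0, lam=0.1, step=0.5, iters=200):
    """Proximal-gradient iteration with a finite-difference gradient."""
    f = lambda x: 0.5 * sum(ri * ri for ri in residual(x))
    x = list(x0)
    for _ in range(iters):
        g = fd_grad(f, x)
        x = prox_l1([xi - step * gi for xi, gi in zip(x, g)], step * lam)
    return x

# Toy problem: r(x) = (x1 - 1, x2). The l1 term shrinks the minimizer of the
# smooth part from (1, 0) to (0.9, 0).
sol = dfo_prox_gradient(lambda x: [x[0] - 1.0, x[1]], [2.0, 2.0])
```

The soft-thresholding step is what makes the second coordinate exactly zero, a behavior a pure smoothing approach would not produce.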

Similar Papers
  • Research Article
  • Citations: 27744
  • 10.1137/0111030
An Algorithm for Least-Squares Estimation of Nonlinear Parameters
  • Jun 1, 1963
  • Journal of the Society for Industrial and Applied Mathematics
  • Donald W Marquardt


  • Research Article
  • Citations: 112
  • 10.1137/0705057
On Solving Nonlinear Equations with a One-Parameter Operator Imbedding
  • Dec 1, 1968
  • SIAM Journal on Numerical Analysis
  • Gunter H Meyer

A one-parameter operator imbedding is used to modify Newton's method for the solution of nonlinear equations.

  • Book Chapter
  • Citations: 10
  • 10.1007/bfb0120068
A new sufficient condition for the well-posedness of non-linear least square problems arising in identification and control
  • Jan 1, 1990
  • Guy Chavent

We show how simple 1-D geometrical calculations (but along all maximal segments of the parameter or control set) can be used to establish the well-posedness of a nonlinear least-squares (NLLS) problem and the absence of local minima in the corresponding error function. These sufficient conditions, which are shown to be sharp by elementary examples, are based on the use of the recently developed “size × curvature” conditions for proving that the output set is strictly quasiconvex. The use of this geometrical theory as a numerical or theoretical tool is discussed. Finally, application to the regularized NLLS problem is shown to give new information on the choice of the regularization parameter.

  • Research Article
  • Citations: 51
  • 10.1017/s0962492904000169
The calculation of linear least squares problems
  • May 1, 2004
  • Acta Numerica
  • Åke Björck

We first survey componentwise and normwise perturbation bounds for the standard least squares (LS) and minimum norm problems. Then some recent estimates of the optimal backward error for an alleged solution to an LS problem are presented. These results are particularly interesting when the algorithm used is not backward stable.

The QR factorization and the singular value decomposition (SVD), developed in the 1960s and early 1970s, remain the basic tools for solving both the LS and the total least squares (TLS) problems. Current algorithms based on Householder or Gram-Schmidt QR factorizations are reviewed. The use of the SVD to determine the numerical rank of a matrix, as well as for computing a sequence of regularized solutions, is then discussed. The solution of the TLS problem in terms of the SVD of the compound matrix $(b\ A)$ is described.

Some recent algorithmic developments are motivated by the need for the efficient implementation of the QR factorization on modern computer architectures. This includes blocked algorithms as well as newer recursive implementations. Other developments come from needs in different application areas. For example, in signal processing rank-revealing orthogonal decompositions need to be frequently updated. We review several classes of such decompositions, which can be more efficiently updated than the SVD.

Two algorithms for the orthogonal bidiagonalization of an arbitrary matrix were given by Golub and Kahan in 1965, one using Householder transformations and the other a Lanczos process. If used to transform the matrix $(b\ A)$ to upper bidiagonal form, this becomes a powerful tool for solving various LS and TLS problems. This bidiagonal decomposition gives a core regular subproblem for the TLS problem. When implemented by the Lanczos process it forms the kernel in the iterative method LSQR. It is also the basis of the partial least squares (PLS) method, which has become a standard tool in statistics.

We present some generalized QR factorizations which can be used to solve different generalized least squares problems. Many applications lead to LS problems where the solution is subject to constraints. This includes linear equality and inequality constraints. Quadratic constraints are used to regularize solutions to discrete ill-posed LS problems. We survey these classes of problems and discuss their solution.

As in all scientific computing, there is a trend that the size and complexity of the problems being solved is steadily growing. Large problems are often sparse or structured. Algorithms for the efficient solution of banded and block-angular LS problems are given, followed by a brief discussion of the general sparse case. Iterative methods are attractive, in particular when matrix-vector multiplication is cheap.
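
The two dense factorizations the survey treats as basic tools, Householder QR and the SVD, can both solve a small LS problem in a few lines. A quick sketch assuming NumPy, with made-up data (fitting a line to three points):

```python
import numpy as np

# Fit intercept and slope to three made-up points via QR and via the SVD.
A = np.array([[1., 1.],
              [1., 2.],
              [1., 3.]])        # columns: intercept, slope
b = np.array([1., 2., 2.])

Q, R = np.linalg.qr(A)          # thin QR factorization
x_qr = np.linalg.solve(R, Q.T @ b)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
x_svd = Vt.T @ ((U.T @ b) / s)  # pseudoinverse solution via the SVD
```

Both routes give the same minimizer here; the SVD route additionally exposes the singular values, which is what makes it the tool of choice for rank determination and regularization as discussed above.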

  • Research Article
  • Citations: 185
  • 10.1007/bf00939613
Circle fitting by linear and nonlinear least squares
  • Feb 1, 1993
  • Journal of Optimization Theory and Applications
  • I D Coope

The problem of determining the circle of best fit to a set of points in the plane (or its obvious generalization to n dimensions) is easily formulated as a nonlinear total least-squares problem which may be solved using a Gauss-Newton minimization algorithm. This straightforward approach is shown to be inefficient and extremely sensitive to the presence of outliers. An alternative formulation allows the problem to be reduced to a linear least-squares problem which is trivially solved. The recommended approach is shown to have the added advantage of being much less sensitive to outliers than the nonlinear least-squares approach.
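
One common linear formulation of circle fitting (often attributed to Kåsa; the abstract does not specify which reduction the paper uses) rewrites the circle equation as x² + y² = 2ax + 2by + d with d = r² − a² − b², which is linear in (a, b, d). A small sketch assuming NumPy, with made-up points:

```python
import numpy as np

# Made-up points lying on the circle with centre (1, 2) and radius 3.
pts = np.array([[4., 2.], [1., 5.], [-2., 2.], [1., -1.]])

# Circle equation x^2 + y^2 = 2*a*x + 2*b*y + d, d = r^2 - a^2 - b^2,
# is linear in the unknowns (a, b, d): an ordinary least-squares problem.
A = np.column_stack([2 * pts[:, 0], 2 * pts[:, 1], np.ones(len(pts))])
rhs = (pts ** 2).sum(axis=1)
(a, b, d), *_ = np.linalg.lstsq(A, rhs, rcond=None)
radius = np.sqrt(d + a * a + b * b)
```

For exact data this recovers the centre and radius exactly; for noisy data it minimizes an algebraic rather than geometric residual, which is the trade-off behind the efficiency the abstract describes.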

  • Conference Article
  • Citations: 2
  • 10.1109/.2001.980445
Model-based estimation of cylinder pressure sensor offset using least-squares methods
  • Dec 4, 2001
  • P Tunestal + 2 more

Two methods for estimating the sensor offset of a cylinder pressure transducer are developed. Both methods fit the pressure data during pre-combustion compression to a polytropic curve. The first method assumes a known polytropic exponent, and the other estimates the polytropic exponent. The first method results in a linear least-squares problem, and the second method results in a nonlinear least-squares problem. The nonlinear least-squares problem is solved by separating out the nonlinear dependence and solving the single-variable minimization problem. For this, a finite difference Newton method is applied. Using this method, the cost of solving the nonlinear least-squares problem is only slightly higher than solving the linear least-squares problem. Both methods show good statistical behavior. Estimation error variances are inversely proportional to the number of pressure samples used for the estimation. The method is computationally inexpensive, and well suited for real-time control applications.

  • Research Article
  • Citations: 30
  • 10.1007/s10107-022-01836-1
Scalable subspace methods for derivative-free nonlinear least-squares optimization
  • Jun 9, 2022
  • Mathematical Programming
  • Coralia Cartis + 1 more

We introduce a general framework for large-scale model-based derivative-free optimization based on iterative minimization within random subspaces. We present a probabilistic worst-case complexity analysis for our method, where in particular we prove high-probability bounds on the number of iterations before a given optimality is achieved. This framework is specialized to nonlinear least-squares problems, with a model-based framework based on the Gauss–Newton method. This method achieves scalability by constructing local linear interpolation models to approximate the Jacobian, and computes new steps at each iteration in a subspace with user-determined dimension. We then describe a practical implementation of this framework, which we call DFBGN. We outline efficient techniques for selecting the interpolation points and search subspace, yielding an implementation that has a low per-iteration linear algebra cost (linear in the problem dimension) while also achieving fast objective decrease as measured by evaluations. Extensive numerical results demonstrate that DFBGN has improved scalability, yielding strong performance on large-scale nonlinear least-squares problems.
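
The Gauss-Newton idea underlying this framework can be sketched in a derivative-free way by replacing the Jacobian with a finite-difference approximation (the paper itself uses linear interpolation models and random subspaces; this toy version is neither DFBGN nor scalable, just the core iteration for two unknowns):

```python
def fd_jacobian(r, x, h=1e-7):
    """Forward-difference approximation of the Jacobian of r at x."""
    rx = r(x)
    J = [[0.0] * len(x) for _ in rx]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += h
        rp = r(xp)
        for i in range(len(rx)):
            J[i][j] = (rp[i] - rx[i]) / h
    return J, rx

def gauss_newton_2d(r, x, iters=20):
    """Plain Gauss-Newton for two unknowns: solve the 2x2 normal
    equations (J^T J) d = J^T r in closed form and set x <- x - d."""
    for _ in range(iters):
        J, rx = fd_jacobian(r, x)
        a = sum(Ji[0] * Ji[0] for Ji in J)
        b = sum(Ji[0] * Ji[1] for Ji in J)
        c = sum(Ji[1] * Ji[1] for Ji in J)
        g0 = sum(Ji[0] * ri for Ji, ri in zip(J, rx))
        g1 = sum(Ji[1] * ri for Ji, ri in zip(J, rx))
        det = a * c - b * b
        x = [x[0] - (c * g0 - b * g1) / det,
             x[1] - (-b * g0 + a * g1) / det]
    return x

# Zero-residual test problem: r(x) = (10*(x2 - x1^2), 1 - x1), minimum (1, 1).
xstar = gauss_newton_2d(lambda x: [10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]],
                        [0.0, 0.0])
```

Each finite-difference Jacobian costs n extra residual evaluations; the interpolation models in the paper exist precisely to amortize that cost across iterations.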

  • Research Article
  • Citations: 6
  • 10.1007/bf02242173
On solving nonlinear least-squares problems in case of rank-deficient Jacobians
  • Mar 1, 1985
  • Computing
  • R Menzel

This paper is concerned with solving nonlinear least-squares problems g(x) := (1/2) F(x)ᵀF(x) → min, F: ℝⁿ → ℝᵐ, m ≥ n, having a rank-deficient Jacobian F′(x*) at a solution point x*. An auxiliary least-squares problem G(u) → min, u := (x, d)ᵀ, of higher dimension is constructed, which can be shown to be well posed if the rank deficiency r of F′(x*) is small. Moreover, it is proved that for arbitrary r, in the consistent case g(x*) = 0, the Gauss-Newton sequence {xₖ} converges at least Q-linearly to x*.

  • Research Article
  • 10.11948/2019.57
Krylov subspace methods with deflation and balancing preconditioners for least squares problems
  • Jan 1, 2019
  • Journal of Applied Analysis & Computation
  • Liang Zhao + 2 more

For solving least squares problems, the CGLS method is the typical choice among iterative methods. When the least squares problem is ill-conditioned, however, the convergence of CGLS deteriorates. We therefore look to other Krylov subspace methods to overcome this disadvantage. The GMRES method is a suitable candidate because it is derived from the minimal residual norm approach, which coincides with the least squares formulation. Ken Hayami proposed BAGMRES for solving least squares problems in [GMRES Methods for Least Squares Problems, SIAM J. Matrix Anal. Appl., 31 (2010), pp. 2400-2430]. Deflation and balancing preconditioners can improve the convergence rate by modulating the spectral distribution. Hence, in this paper we apply preconditioned Krylov subspace methods with deflation and balancing preconditioners to ill-conditioned least squares problems. Numerical experiments show that the proposed methods outperform the CGLS method.
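
For reference, the CGLS baseline the paper compares against is conjugate gradients applied to the normal equations AᵀAx = Aᵀb without ever forming AᵀA. A minimal sketch assuming NumPy, with made-up data:

```python
import numpy as np

def cgls(A, b, iters=50, tol=1e-14):
    """Conjugate gradients on the normal equations A^T A x = A^T b,
    applied without forming A^T A explicitly."""
    x = np.zeros(A.shape[1])
    r = b.copy()              # residual b - A x
    s = A.T @ r               # negative gradient of 0.5*||Ax - b||^2
    p = s.copy()
    gamma = s @ s
    for _ in range(iters):
        if gamma < tol:       # stop once the normal-equations residual is tiny
            break
        q = A @ p
        alpha = gamma / (q @ q)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
    return x

A = np.array([[1., 0.], [1., 1.], [0., 1.]])
b = np.array([1., 2., 1.5])
x = cgls(A, b)
```

Because CGLS works implicitly with AᵀA, its convergence depends on the square of the condition number of A, which is the deterioration the paper's preconditioned methods aim to avoid.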

  • Research Article
  • Citations: 46
  • 10.1007/bf01396223
Algebraic relations between the total least squares and least squares problems with more than one solution
  • Dec 1, 1992
  • Numerische Mathematik
  • Musheng Wei

This paper completes our previous discussion of the total least squares (TLS) and least squares (LS) problems for the linear system AX = B, which may have more than one solution [12, 13], and generalizes the work of Golub and Van Loan [1, 2], Van Huffel [8], and Van Huffel and Vandewalle [11]. The TLS problem is extended to a more general case. The sets of solutions and the squared residuals for the TLS and LS problems are compared. The concept of weighted squared residuals is extended, and the difference between the TLS and LS approaches is derived. The connection between the approximate subspaces and the perturbation theories is studied. It is proved that, under moderate conditions, all the corresponding quantities for the solution sets of the TLS and the modified LS problems are close to each other, while the quantities for the solution set of the LS problem are close to the corresponding ones of a subset of that of the TLS problem.

  • Book Chapter
  • 10.1007/11559887_18
A Comparison of Condition Numbers for the Full Rank Least Squares Problem
  • Jan 1, 2005
  • Joab R Winkler

Condition numbers of the full rank least squares (LS) problem min_x ‖Ax − b‖₂ are considered theoretically, and their computational implementation is compared. These condition numbers range from a simple normwise measure, which may overestimate by several orders of magnitude the true numerical condition of the LS problem, to refined componentwise and normwise measures. Inequalities that relate these condition numbers are established, and it is concluded that the solution x0 of the LS problem may be well-conditioned in the normwise sense even if one of its components is ill-conditioned. It is shown that the refined condition numbers are themselves ill-conditioned in some circumstances; the cause of this ill-conditioning is identified and its implications are discussed.
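
The simple normwise measure referred to above is the 2-norm condition number κ₂(A) = σ_max/σ_min. A quick computation (assuming NumPy) on a made-up, nearly rank-deficient matrix:

```python
import numpy as np

# A 3x2 matrix whose columns are almost parallel: nearly rank-deficient.
A = np.array([[1., 1.000],
              [1., 1.001],
              [1., 0.999]])
s = np.linalg.svd(A, compute_uv=False)
kappa = s[0] / s[-1]    # the simple normwise measure kappa_2(A)
```

A large κ₂(A) flags potential trouble, but as the abstract notes it can drastically overstate the sensitivity of any individual solution component.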

  • Research Article
  • Citations: 46
  • 10.1109/tac.2018.2838045
A Regularized Variable Projection Algorithm for Separable Nonlinear Least Squares Problems
  • Jan 1, 2018
  • IEEE Transactions on Automatic Control
  • Guang-Yong Chen + 3 more

Separable nonlinear least-squares (SNLLS) problems arise frequently in many research fields, such as system identification and machine learning. The variable projection (VP) method is a very powerful tool for solving such problems. In this paper, we consider the regularization of ill-conditioned SNLLS problems based on the VP method. Selecting an appropriate regularization parameter is difficult because of the nonlinear optimization procedure. We propose to determine the regularization parameter using the weighted generalized cross-validation method at every iteration, which causes the objective function to change during the optimization procedure. To circumvent this problem, we use an inequality to enforce a consistent decrease across successive iterations. The approximation of the Jacobian of the regularized problem is also discussed. The proposed regularized VP algorithm is tested on the parameter estimation problem of several statistical models. Numerical results demonstrate the effectiveness of the proposed algorithm.
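
The variable projection idea, eliminating the linear parameters in closed form so that only the nonlinear ones remain, can be sketched for a one-term exponential model. This toy version replaces the paper's iterative solver and regularization with a plain grid search over the single nonlinear parameter; all names and data are made up:

```python
import math

# Separable model y = c * exp(-a * t): c enters linearly, a nonlinearly.
t = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [2.0 * math.exp(-1.3 * ti) for ti in t]   # exact data: a = 1.3, c = 2

def projected_residual(a):
    """For fixed a, eliminate c by linear least squares; return (residual, c)."""
    phi = [math.exp(-a * ti) for ti in t]
    c = sum(p * yi for p, yi in zip(phi, y)) / sum(p * p for p in phi)
    return sum((c * p - yi) ** 2 for p, yi in zip(phi, y)), c

# Optimize only over the remaining nonlinear parameter a.
best_a = min((k / 1000 for k in range(1, 3000)),
             key=lambda a: projected_residual(a)[0])
_, best_c = projected_residual(best_a)
```

The projection shrinks a two-variable nonlinear fit to a one-variable search, which is exactly the dimension reduction that makes VP attractive for larger separable problems.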

  • Research Article
  • Citations: 7
  • 10.1016/j.heliyon.2021.e07499
Alternative structured spectral gradient algorithms for solving nonlinear least-squares problems
  • Jul 1, 2021
  • Heliyon
  • Mahmoud Muhammad Yahaya + 3 more

The study of efficient iterative algorithms for addressing nonlinear least-squares (NLS) problems is of great importance. The NLS problems, which belong to a special class of unconstrained optimization problems, are of particular interest because of the special structure of their gradients and Hessians. In this paper, based on the spectral parameters of Barzilai and Borwein (1988), we propose three structured spectral gradient algorithms for solving NLS problems. Each spectral parameter in the respective algorithms incorporates the structured gradient and the information gained from the structured Hessian approximation. Moreover, we develop a safeguarding technique for the first two structured spectral parameters to avoid negative curvature directions. Furthermore, using a nonmonotone line-search strategy, we show that the proposed algorithms are globally convergent under some standard conditions. Comparative computational results on some standard test problems show that the proposed algorithms are efficient.

  • Research Article
  • Citations: 34
  • 10.1007/s10107-018-1327-8
A successive difference-of-convex approximation method for a class of nonconvex nonsmooth optimization problems
  • Sep 8, 2018
  • Mathematical Programming
  • Tianxiang Liu + 2 more

We consider a class of nonconvex nonsmooth optimization problems whose objective is the sum of a smooth function and a finite number of nonnegative proper closed possibly nonsmooth functions (whose proximal mappings are easy to compute), some of which are further composed with linear maps. This class of problems arises naturally in various applications when different regularizers are introduced for inducing simultaneous structures in the solutions. Solving these problems, however, can be challenging because of the coupled nonsmooth functions: the corresponding proximal mapping can be hard to compute, so standard first-order methods such as the proximal gradient algorithm cannot be applied efficiently. In this paper, we propose a successive difference-of-convex approximation method for solving this class of problems. In this algorithm, we approximate the nonsmooth functions by their Moreau envelopes in each iteration. Making use of the simple observation that Moreau envelopes of nonnegative proper closed functions are continuous difference-of-convex functions, we can then approximately minimize the approximation function by first-order methods with suitable majorization techniques. These first-order methods can be implemented efficiently thanks to the fact that the proximal mapping of each nonsmooth function is easy to compute. Under suitable assumptions, we prove that the sequence generated by our method is bounded and any accumulation point is a stationary point of the objective. We also discuss how our method can be applied to concrete applications such as nonconvex fused regularized optimization problems and simultaneously structured matrix optimization problems, and illustrate the performance numerically for these two specific applications.

  • Research Article
  • 10.5555/3029652.3029656
Computational Behavior of Gauss–Newton Methods
  • May 1, 1989
  • SIAM Journal on Scientific and Statistical Computing
  • C. Fraley

This paper is concerned with the numerical behavior of Gauss–Newton methods for nonlinear least-squares problems. It is well known that Gauss–Newton methods often cannot be applied successfully wit...
