We discuss a certain special subset of Lagrange multipliers, called critical, which usually exist when multipliers associated to a given solution are not unique. This kind of multipliers appear to be important for a number of reasons, some understood better, some (currently) not fully. What is clear, is that Newton and Newton-related methods have an amazingly strong tendency to generate sequences with dual components converging to critical multipliers. This is quite striking because, typically, the set of critical multipliers is “thin” (the set of noncritical ones is relatively open and dense, meaning that its closure is the whole set). Apart from mathematical curiosity to understand the phenomenon for something as classical as the Newton method, the attraction to critical multipliers is relevant computationally. This is because convergence to such multipliers is the reason for slow convergence of the Newton method in degenerate cases, as convergence to noncritical limits (if it were to happen) would have given the superlinear rate. Moreover, the attraction phenomenon shows up not only for the basic Newton method, but also for other related techniques (for example, quasi-Newton, and the linearly constrained augmented Lagrangian method). Despite clear computational evidence, proving that convergence to a critical limit must occur appears to be a challenge, at least for general problems. We outline the partial results obtained up to now. We also discuss the important role that noncritical multipliers play for stability, sensitivity, and error bounds. Finally, an important issue is dual stabilization, i.e., techniques to avoid moving along the multiplier set towards a critical one (since it leads to slow convergence). We discuss the algorithms that do the job locally, i.e., when initialized close enough to a noncritical multiplier, their dual behavior is as desired. These include the stabilized sequential quadratic programming method and the augmented Lagrangian algorithm. However, when the starting point is far, even those algorithms do not appear to provide fully satisfactory remedies. We discuss the challenges with constructing good algorithms for the degenerate case, which have to incorporate dual stabilization for fast local convergence, at an acceptable computational cost, and also be globally efficient.
Read full abstract