Methods are presented for least squares data smoothing by using the signs of divided differences of the smoothed values. Professor M.J.D. Powell initiated the subject in the early 1980s and, since then, theory, algorithms and FORTRAN software have made it applicable to several disciplines in various ways. Consider n measurements of a univariate function that have been altered by random errors. It is then usual for the divided differences of the measurements to show sign alterations that are probably due to the data errors. We make the least sum of squares change to the measurements by requiring the sequence of divided differences of order m to have at most q sign changes, for some prescribed integer q. The positions of the sign changes are integer variables of the optimization calculation, which implies a combinatorial problem whose solution can require about O(n^q) quadratic programming calculations in n variables and n−m constraints. Suitable methods have been developed for the following cases. A dynamic programming procedure can calculate the global minimum for the important cases of piecewise monotonicity (m=1, q⩾1) and piecewise convexity/concavity (m=2, q⩾1) of the smoothed values. The complexity of the procedure in the case m=1 is O(n² + qn log₂ n) computer operations, which is reduced to only O(n) when q=0 (monotonicity) or q=1 (increasing/decreasing monotonicity). The case m=2, q⩾1 requires O(qn²) computer operations and n² quadratic programming calculations, which are reduced to one quadratic programming calculation when m=2, q=0 (convexity) and to n−2 such calculations when m=2, q=1 (convexity/concavity). Unfortunately, the technique that achieves this efficiency does not generalize to the highly nonlinear case m⩾3, q⩾2. However, the case m⩾3, q=0 is solved by a special strictly convex quadratic programming calculation, and the case m⩾3, q=1 can be solved by at most 2(n−m) applications of the previous algorithm. Also, as m grows, large sets of active constraints usually occur at the optimal approximation, which makes the calculation for higher values of q less expensive than it might seem. Further, attention is given to the sensitivity of the solution with respect to changes in the constraints and the data. The smoothing technique is an active research topic and there seems to be room for further developments. One strong reason for studying methods that make use of divided differences for data smoothing is that, although only the data are provided, the achieved accuracy goes far beyond the accuracy of the data, at an order determined automatically by the chosen divided differences.
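To make the quantities in the abstract concrete, the following Python sketch (illustrative only; it is not the FORTRAN software mentioned above, and all function names are hypothetical) computes order-m divided differences, counts their sign changes, and solves the simplest case m=1, q=0, i.e. the least-squares nondecreasing fit, using the classical pool-adjacent-violators algorithm, which is one standard O(n) method for this case rather than necessarily the authors' own algorithm.

```python
def divided_differences(x, y, m):
    """Return the n - m divided differences of order m for data (x, y)."""
    d = [float(v) for v in y]
    for k in range(1, m + 1):
        # Standard recurrence: d[i] <- (d[i+1] - d[i]) / (x[i+k] - x[i]).
        d = [(d[i + 1] - d[i]) / (x[i + k] - x[i]) for i in range(len(d) - 1)]
    return d

def sign_changes(seq, tol=1e-12):
    """Count sign changes in a sequence, treating |v| <= tol as zero."""
    signs = [1 if v > tol else -1 for v in seq if abs(v) > tol]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def monotone_fit(y):
    """Least-squares nondecreasing fit (m=1, q=0) by pool-adjacent-violators."""
    means, counts = [], []          # block means and block sizes
    for v in y:
        means.append(float(v))
        counts.append(1)
        # Merge adjacent blocks while their means violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            total = counts[-2] + counts[-1]
            merged = (means[-2] * counts[-2] + means[-1] * counts[-1]) / total
            means.pop(); counts.pop()
            means[-1], counts[-1] = merged, total
    # Expand the blocks back to n smoothed values.
    fit = []
    for mval, c in zip(means, counts):
        fit.extend([mval] * c)
    return fit

if __name__ == "__main__":
    import random
    random.seed(0)
    x = list(range(20))
    y = [t + random.gauss(0.0, 2.0) for t in x]   # increasing trend plus noise
    print(sign_changes(divided_differences(x, y, 1)))          # several changes
    print(sign_changes(divided_differences(x, monotone_fit(y), 1)))  # 0
```

Run on noisy increasing data, the sketch typically reports several sign changes in the first divided differences of the measurements and none in those of the smoothed values, which is exactly the constraint the abstract imposes when q=0.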