Abstract
This paper presents a more detailed analysis than has previously appeared of a class of minimization algorithms that includes the DFP (Davidon-Fletcher-Powell) method as a special case. Only quadratic functions are considered, but particular attention is paid to the magnitude of successive errors and their dependence upon the initial matrix. On this basis a possible explanation of some of the observed characteristics of the class is tentatively suggested.

Introduction

Probably the best-known algorithm for determining the unconstrained minimum of a function of many variables, where explicit expressions are available for the first partial derivatives, is that of Davidon (1959) as modified by Fletcher & Powell (1963). This algorithm has many virtues. It is simple and does not require at any stage the solution of linear equations. It minimizes a quadratic function exactly in a finite number of steps, and this property makes convergence of the algorithm rapid, when applied to more general functions, in the neighbourhood of the solution. It is, at least in theory, stable, since the iteration matrix H_i, which transforms the ith gradient into the ith step direction, may be shown to be positive definite.

In practice the algorithm has been generally successful, but it has exhibited some puzzling behaviour. Broyden (1967) noted that H_i does not always remain positive definite, and attributed this to rounding errors. Pearson (1968) found that for some problems the solution was obtained more efficiently if H_i was reset to a positive definite matrix, often the unit matrix, at intervals during the computation. Bard (1968) noted that H_i could become singular, attributed this to rounding error, and suggested the use of suitably chosen scaling factors as a remedy.

In this paper we analyse the more general algorithm given by Broyden (1967), of which the DFP algorithm is a special case, and determine how, for quadratic functions, the choice of an arbitrary parameter affects convergence. We investigate how the successive errors depend, again for quadratic functions, upon the initial choice of iteration matrix, paying particular attention to the cases where this is either the unit matrix or a good approximation to the inverse Hessian. We finally give a tentative explanation of some of the observed experimental behaviour in the case where the function to be minimized is not quadratic.
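To make the class concrete, the following is a minimal numerical sketch, not taken from the paper and using one common modern parametrization of the Broyden family rather than the paper's notation: a rank-two quasi-Newton update with a free parameter phi applied to a quadratic function, where phi = 0 recovers the DFP update. The function name broyden_class_minimize, the parameter phi, and the quadratic test problem are all illustrative assumptions.

```python
import numpy as np

def broyden_class_minimize(A, b, x0, H0, phi=0.0, tol=1e-10, max_iter=50):
    """Minimize f(x) = 0.5 x'Ax - b'x (A symmetric positive definite)
    with a Broyden-class quasi-Newton update; phi = 0 gives DFP.
    Illustrative sketch only, not the paper's algorithm or notation."""
    x = x0.astype(float)
    H = H0.astype(float)
    g = A @ x - b                       # gradient of the quadratic
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        d = -H @ g                      # H maps the gradient to the step direction
        # exact line search along d for a quadratic: alpha = -g.d / d.A.d
        alpha = -(g @ d) / (d @ A @ d)
        s = alpha * d                   # step taken
        x = x + s
        g_new = A @ x - b
        y = g_new - g                   # change in gradient
        sy = s @ y
        Hy = H @ y
        yHy = y @ Hy
        # DFP rank-two update
        H = H + np.outer(s, s) / sy - np.outer(Hy, Hy) / yHy
        # one-parameter correction term: phi = 0 recovers DFP
        v = s / sy - Hy / yHy
        H = H + phi * yHy * np.outer(v, v)
        g = g_new
    return x, H

# Example: with H0 the unit matrix, exact line searches on an n-variable
# quadratic reach the minimum in at most n steps, and in exact arithmetic
# the final H equals the inverse Hessian A^{-1}.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x, H = broyden_class_minimize(A, b, x0=np.zeros(2), H0=np.eye(2))
```

The choice of phi in such a sketch plays the role of the arbitrary parameter in Broyden's (1967) class, and the choice of H0 corresponds to the initial iteration matrix whose influence on the successive errors is the subject of the analysis.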