Abstract

This paper proposes a new Levenberg-Marquardt algorithm that is accelerated by adjusting a Jacobian matrix and a quasi-Hessian matrix. The proposed method partitions the Jacobian matrix into block matrices and employs the inverse of a partitioned matrix to find the inverse of the quasi-Hessian matrix. Our method avoids expensive operations and saves memory in calculating the inverse of the quasi-Hessian matrix, which shortens the training time needed for convergence. In experiments on a large application, the proposed method saved about 20% of the training time compared with other algorithms.

Keywords: Error backpropagation, Levenberg-Marquardt algorithm, multilayer perceptrons.

1. Introduction

The multilayer perceptron (MLP) has been widely used in many applications since it was developed. To train the MLP, many studies have used the error backpropagation (EBP) algorithm because it is easy and simple (Lippmann, 1987; Na and Kwon, 2010; Oh et al., 2011). However, this algorithm converges slowly because it is a gradient descent method. Many attempts have been made to overcome that drawback (Oh and Lee, 1995; Vogl et al., 1988; Yu et al., 1995), but EBP still suffers from a slow training speed (Buntine, 1994).

As an alternative to EBP, many studies have used a second-order method such as the conjugate gradient method (Charalambous, 1992), the quasi-Newton method (Setiono and Hui, 1995), the Gauss-Newton method, or the Levenberg-Marquardt (LM) algorithm (Hagan and Menhaj, 1994; Wilamowski and Yu, 2010). These methods are designed to be faster than EBP. The LM algorithm is estimated to be much faster than the other algorithms when the MLP is not very large, and it is used as the default training method in the Matlab Toolbox because of its fast convergence.

However, the LM algorithm must build a Jacobian matrix and calculate the inverse of a quasi-Hessian matrix, and these two matrices are a large burden for the algorithm. To address this problem, Lera and Pinzolas (2002) trained local nodes of the MLP to save both memory and expensive matrix operations, and Wilamowski and Yu computed the quasi-Hessian matrix directly from the gradient vector of each pattern, without Jacobian matrix multiplication.
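To make the quantities concrete, the sketch below shows a standard LM weight update in which the damped quasi-Hessian J^T J + mu*I is inverted through a 2x2 block partition via the Schur complement rather than as one dense matrix. This is only a minimal illustration of applying a partitioned-matrix inverse to the quasi-Hessian; the function name, the partition point, and the variable names (J, e, mu, split) are assumptions for this example and do not reflect the paper's exact scheme.

```python
# Illustrative sketch (not the paper's exact method): a standard LM weight
# update where the damped quasi-Hessian H = J^T J + mu*I is inverted
# block-wise with the Schur complement instead of as a single dense matrix.
import numpy as np

def lm_update_blockwise(J, e, mu, split):
    """Return the LM update -(J^T J + mu*I)^{-1} J^T e, computing the
    inverse through a 2x2 block partition of H at index `split`."""
    n = J.shape[1]
    H = J.T @ J + mu * np.eye(n)   # damped quasi-Hessian
    g = J.T @ e                    # gradient of 0.5 * ||e||^2

    # Partition H into [[A, B], [B^T, D]].
    A = H[:split, :split]
    B = H[:split, split:]
    D = H[split:, split:]

    A_inv = np.linalg.inv(A)            # inverse of the smaller block
    S = D - B.T @ A_inv @ B             # Schur complement of A in H
    S_inv = np.linalg.inv(S)

    # Standard partitioned-matrix inverse formula for a symmetric H.
    top_left = A_inv + A_inv @ B @ S_inv @ B.T @ A_inv
    top_right = -A_inv @ B @ S_inv
    H_inv = np.block([[top_left, top_right],
                      [top_right.T, S_inv]])
    return -H_inv @ g

# Toy usage: 50 residuals, 8 weights, partitioned as 5 + 3.
rng = np.random.default_rng(0)
J = rng.standard_normal((50, 8))
e = rng.standard_normal(50)
dw = lm_update_blockwise(J, e, mu=0.01, split=5)
print(dw.shape)  # (8,)
```

The point of the block formula is that only the two smaller blocks A and S need explicit inversion, which is the kind of saving in memory and matrix operations that the abstract refers to.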
