Abstract

Training artificial neural networks is normally a time-consuming task because of the iterative search imposed by the implicit nonlinearity of the network behaviour. In this work, three improvements to “batch-mode” offline training methods, gradient-based or gradient-free, are proposed. For nonlinear multilayer perceptrons (NMLPs) with linear output layers, a method based on linear regression in the output layer is presented. For arbitrary NMLPs, an algorithm is developed that detects “saturated” hidden nodes and re-activates them while transferring their contribution onto the bias node of the same layer. For state-feedback NMLPs whose learning data are incomplete in the state variables, a method is shown that interpolates the unknown state values to form an intermediate training set, which is used to find good initial weights for the final training on the original training set only. In addition, three conventional gradient-based training methods (steepest-descent gradient search, conjugate gradient, and Gauss–Newton) are compared with one another and with the above improvements on the same two example problems. Where the conventional methods get stuck in bad local minima, saturation avoidance leads to satisfactory results, and the speed-up achieved by the other two improvements is about two orders of magnitude.

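The first improvement rests on the observation that, for a fixed set of hidden-layer weights, a linear output layer can be fitted in one step by least squares instead of by iterative search. The sketch below illustrates this idea under that assumption; the function and variable names (train_output_layer, H, Y, and the random hidden weights) are illustrative and are not taken from the paper.

```python
import numpy as np

def train_output_layer(H, Y):
    """Fit a linear output layer by least squares.

    H : (n_samples, n_hidden) hidden-layer activations
    Y : (n_samples, n_outputs) target outputs
    Returns W : (n_hidden + 1, n_outputs) output weights, last row = bias.
    """
    H_aug = np.hstack([H, np.ones((H.shape[0], 1))])  # append bias input
    W, *_ = np.linalg.lstsq(H_aug, Y, rcond=None)     # exact linear solve
    return W

# Example usage: single tanh hidden layer with (here) fixed random weights.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))             # inputs
Y = np.sin(X[:, :1]) + 0.5 * X[:, 1:]                 # targets
W_hid = rng.normal(size=(2, 10))                      # hidden weights
b_hid = rng.normal(size=10)                           # hidden biases
H = np.tanh(X @ W_hid + b_hid)                        # hidden activations
W_out = train_output_layer(H, Y)                      # one-shot output fit
Y_hat = np.hstack([H, np.ones((200, 1))]) @ W_out     # network output
```

In a full training loop, only the hidden-layer weights would remain subject to iterative (gradient-based or gradient-free) search, while the output layer is re-solved exactly at each step, which is the source of the reported speed-up.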