Fast parallel off-line training of multilayer perceptrons

S. McLoone, G.W. Irwin

doi:10.1109/72.572103

Abstract

Various approaches to the parallel implementation of second-order gradient-based multilayer perceptron training algorithms are described. Two main classes of algorithm are defined involving Hessian and conjugate gradient-based methods. The limited- and full-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithms are selected as representative examples and used to show that the step size and gradient calculations are critical components. For larger problems the matrix calculations in the full-memory algorithm are also significant. Various strategies are considered for parallelization, the best of which is implemented on parallel virtual machine (PVM) and transputer-based architectures. Results from a range of problems are used to demonstrate the performance achievable with each architecture. The transputer implementation is found to give excellent speed-ups but the problem size is limited by memory constraints. The speed-ups achievable with the PVM implementation are much poorer because of inefficient communication, but memory is not a difficulty.
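The abstract identifies the gradient calculation as a critical component of training. As a minimal sketch of why that step lends itself to parallelization, the example below splits a training set across workers, has each compute a partial batch gradient, and sums the results. This is one common strategy (training-set partitioning) and is an assumption here; the abstract does not say which strategy the authors found best. The network (one tanh hidden layer, a single linear output, no biases), the function names, and the use of Python threads are all illustrative choices and do not reflect the PVM or transputer implementations.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def partial_gradient(weights, samples):
    """Gradient of 0.5 * sum((y - t)^2) over `samples` for a small MLP.

    Hypothetical network: tanh hidden layer (W1), one linear output (W2), no biases.
    """
    W1, W2 = weights
    g1 = [[0.0] * len(row) for row in W1]
    g2 = [0.0] * len(W2)
    for x, t in samples:
        # Forward pass
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        y = sum(w * hj for w, hj in zip(W2, h))
        e = y - t  # dE/dy for the squared-error cost
        # Backward pass, accumulated over the samples in this partition
        for j, hj in enumerate(h):
            g2[j] += e * hj
            d = e * W2[j] * (1.0 - hj * hj)  # tanh'(a) = 1 - tanh(a)^2
            for i, xi in enumerate(x):
                g1[j][i] += d * xi
    return g1, g2

def add_gradients(a, b):
    """Element-wise sum of two (g1, g2) gradient pairs."""
    g1 = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    g2 = [p + q for p, q in zip(a[1], b[1])]
    return g1, g2

def parallel_gradient(weights, samples, n_workers=2):
    """Partition the training set, compute partial gradients, then reduce."""
    chunks = [samples[k::n_workers] for k in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: partial_gradient(weights, c), chunks))
    total = parts[0]
    for p in parts[1:]:
        total = add_gradients(total, p)
    return total

# Toy data (made up for illustration): 2 inputs, 2 hidden units, XOR-like targets.
weights = ([[0.1, -0.2], [0.3, 0.4]], [0.5, -0.6])
samples = [([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0),
           ([1.0, 1.0], 0.0), ([0.0, 0.0], 0.0)]

serial = partial_gradient(weights, samples)
parallel = parallel_gradient(weights, samples, n_workers=2)
```

Because the batch gradient is a sum over training examples, the partitioned result matches the serial one up to floating-point rounding. Only the weights going out and the partial gradients coming back need to be communicated, which is where communication cost enters on a message-passing system. Threads in CPython will not give a real speed-up for this pure-Python arithmetic; the sketch shows the structure only.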
