Abstract

This letter analyzes the performance of the neural network training method known as optimization layer by layer (OLL). We show, from theoretical considerations, that the amount of work required with OLL learning scales as the third power of the network size, compared with the square of the network size for commonly used conjugate gradient (CG) training algorithms. This theoretical estimate is confirmed with a practical example. Thus, although OLL is shown to work very well for small neural networks (fewer than about 500 weights per layer), it is slower than CG for large neural networks. We further show that OLL does not always improve on the accuracy that can be obtained with CG; the final accuracy appears to depend strongly on the initial network weights.
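To make the scaling claim concrete, the following minimal Python sketch (not from the paper) compares a cubic cost model for OLL with a quadratic cost model for CG. The constant factors c_oll and c_cg are purely illustrative assumptions, chosen here so that the crossover falls near 500 weights per layer as stated in the abstract; the actual constants depend on the network and implementation.

```python
# Illustrative sketch only: cost models assumed from the abstract's scaling
# claim (OLL work ~ cubic in layer size, CG work ~ quadratic).
# The constants c_oll and c_cg are hypothetical, not taken from the paper.

def oll_work(weights_per_layer: int, c_oll: float = 1.0) -> float:
    """Hypothetical per-training work for OLL: cubic in weights per layer."""
    return c_oll * weights_per_layer ** 3

def cg_work(weights_per_layer: int, c_cg: float = 500.0) -> float:
    """Hypothetical per-training work for CG: quadratic in weights per layer."""
    return c_cg * weights_per_layer ** 2

if __name__ == "__main__":
    for n in (100, 250, 500, 1000, 2000):
        ratio = oll_work(n) / cg_work(n)
        print(f"{n:5d} weights/layer: OLL/CG work ratio ~ {ratio:.2f}")
    # With these assumed constants the ratio passes 1 near 500 weights per
    # layer, mirroring the observation that OLL becomes slower than CG
    # beyond roughly that network size.
```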
