Abstract

This paper develops a means of computing an optimal learning rate to improve the performance of time-lagged recurrent neural networks (TLRN) trained with the generalized extended Kalman filter (GEKF) algorithm. Previously, an optimal learning rate was introduced for gradient-descent training of feedforward networks with the backpropagation algorithm, mainly to accelerate training. The focus of this paper is on the performance improvement of neural networks trained by the more powerful Kalman algorithms. We derive a means of determining the optimal learning rate (OLR) on an epoch-by-epoch basis for GEKF, and describe how the OLR is used in practice. A practical case study in accurate estimation from noisy input data, a mass-air-flow virtual sensor problem, is presented to illustrate the benefits of this approach. A family of neural networks of different sizes and different node functions is trained with the standard extended Kalman filter algorithm, with and without the optimal learning rate. Typically, GEKF provides performance improvements of 20% to 100% over simplified Kalman filter methods, which in turn have produced better results than any other method we have tried on complex, extensive datasets. Our new results show that, with the OLR, the performance of networks may be further improved over that of networks trained by algorithms with fixed learning rates.
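The abstract does not reproduce the OLR derivation, but a common convention for folding a learning rate eta into EKF-based weight training is to scale the effective measurement noise inside the innovation covariance, so that eta = 1 recovers the standard EKF update and smaller eta damps the Kalman gain. The NumPy sketch below illustrates that generic mechanism on a toy scalar-output network; the function ekf_train_step, the finite-difference Jacobian, the 1-5-1 tanh network, and all hyperparameters are illustrative assumptions, not the paper's method.

```python
import numpy as np

def ekf_train_step(w, P, x, y, predict, eta=1.0, r=1.0, q=1e-6):
    """One EKF weight update with a learning rate eta.

    w       : flat weight vector (n,)
    P       : weight-error covariance (n, n)
    predict : callable (w, x) -> scalar network output
    eta     : learning rate, folded in by scaling the effective
              measurement noise r (an assumed convention; the
              paper's OLR formula is not reproduced here)
    """
    n = w.size
    eps = 1e-6
    f0 = predict(w, x)
    # Jacobian of the output w.r.t. the weights, by finite differences
    H = np.zeros((1, n))
    for i in range(n):
        wp = w.copy()
        wp[i] += eps
        H[0, i] = (predict(wp, x) - f0) / eps
    S = r / eta + H @ P @ H.T           # innovation covariance (1, 1)
    K = P @ H.T / S                     # Kalman gain (n, 1)
    w_new = w + (K * (y - f0)).ravel()  # weight update
    P_new = P - K @ H @ P + q * np.eye(n)  # covariance + process noise
    return w_new, P_new

# Toy usage: fit y = sin(x) with a hypothetical 1-5-1 tanh network
rng = np.random.default_rng(0)

def predict(w, x):
    W1, b1, W2, b2 = w[:5], w[5:10], w[10:15], w[15]
    return float(W2 @ np.tanh(W1 * x + b1) + b2)

w, P = 0.1 * rng.standard_normal(16), 0.1 * np.eye(16)
for epoch in range(20):
    for x in rng.uniform(-np.pi, np.pi, 25):
        w, P = ekf_train_step(w, P, x, np.sin(x), predict, eta=0.5)
```

In this sketch eta is held fixed across training; the paper's contribution, by contrast, is to compute an optimal value of the learning rate anew on each epoch rather than fixing it in advance.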
