Abstract

The tangent plane algorithm is a fast sequential learning method for multilayered feedforward neural networks that accepts near-zero initial conditions for the connection weights, with the expectation that only the minimum number of weights will be activated. However, the inclusion of a tendency to move away from the origin in weight space can lead to large weights that are harmful to generalization. This paper evaluates two techniques used to limit the size of the weights in the tangent plane algorithm: weight growing and weight elimination. Comparative tests were carried out using the Extreme Learning Machine (ELM), a fast global minimiser giving good generalization. Experimental results show that the generalization performance of the tangent plane algorithm with weight elimination is at least as good as that of the ELM algorithm, making it a suitable alternative for problems that involve time varying data such as EEG and ECG signals.

Highlights

  • In Lee [1] an algorithm was described for supervised training in multilayered feedforward neural networks giving faster convergence and improved generalization relative to the gradient descent backpropagation algorithm

  • A directional movement vector is introduced into the training process to push the movement in weight space towards the origin

  • The ability of the new improved tangent plane algorithm (iTPA) and the original tangent plane algorithm to generalise from a given set of training data was evaluated and compared with the Extreme Learning Machine (ELM)

Introduction

In Lee [1] an algorithm was described for supervised training in multilayered feedforward neural networks giving faster convergence and improved generalization relative to the gradient descent backpropagation algorithm. This tangent plane algorithm starts the training with the connection weights set to values close to zero, in the expectation that only the minimum number of weights necessary will be activated. According to Bartlett [2], the size of the weights is more important than the number of weights in determining good generalization. This poses the following question: can we modify this algorithm so that it discourages the formation of weights with large values? Further, can the algorithm encourage weights with small values to decay rapidly to zero, producing a network having the optimum size for good generalization?
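The paper's tangent plane update itself is not reproduced on this page, but the weight-elimination idea it evaluates is a standard penalty that drives small weights towards zero while the cost of large weights saturates. The sketch below is a minimal illustration of that generic penalty added to an ordinary gradient step; the function name, the scale parameter `w0`, and the coefficient `lam` are illustrative assumptions, not values or notation taken from the paper.

```python
import numpy as np

def weight_elimination_penalty(weights, w0=1.0, lam=1e-3):
    """Generic weight-elimination penalty: lam * sum(w^2 / (w0^2 + w^2)).

    Small weights are pushed towards zero, while the penalty on large
    weights saturates near lam, so the network is nudged towards using
    only the weights it actually needs.
    """
    w2 = weights ** 2
    penalty = lam * np.sum(w2 / (w0 ** 2 + w2))
    # Derivative of the penalty with respect to each weight.
    grad = lam * 2.0 * weights * w0 ** 2 / (w0 ** 2 + w2) ** 2
    return penalty, grad

# Illustrative update step: near-zero initial weights, as in the tangent
# plane algorithm, with the penalty gradient added to a task gradient.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.01, size=20)   # near-zero initial weights
task_grad = rng.normal(size=20)       # placeholder for the error gradient
_, reg_grad = weight_elimination_penalty(w)
w -= 0.1 * (task_grad + reg_grad)     # gradient step with the penalty
```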
