Abstract

Three different representations of a thresholded linear equation are developed. For binary input they are shown to be representationally equivalent, though their training characteristics differ. A training algorithm for linear equations is discussed. The similarities between its simplest mathematical representation (perceptron training), a formal model of animal learning (Rescorla-Wagner learning), and one mechanism of neural learning (the Aplysia gill-withdrawal reflex) are pointed out. For d input features, perceptron training is shown to have a lower bound of 2^d and an upper bound of d^d weight adjustments. It is possible that the true upper bound is 4^d, though this has not been proved. Average performance is shown to have a lower bound of 1.4^d. Learning time is shown to increase linearly with the number of irrelevant or replicated features. The (X of N) function (a subset of linearly separable functions containing OR and AND) is shown to be learnable in d^3 time. A method of using conditional probability to accelerate learning is proposed. It reduces the observed growth rate from 4^d to the theoretical minimum for the unmodified version, 2^d. A different version reduces the growth rate to about 1.7^d. The linear effect of irrelevant features can also be eliminated. Whether such an approach can be made provably convergent is not known.
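
For concreteness, the following is a minimal sketch of the perceptron training rule the abstract refers to, assuming 0/1 binary features and a 0/1 target class; the function name, parameters, and update counting below are illustrative assumptions, not the paper's own formulation. The error-driven update (weight change proportional to error times input) has the same form as the Rescorla-Wagner learning rule mentioned above.

    def train_perceptron(samples, d, max_epochs=1000):
        """Train a thresholded linear unit on binary (0/1) feature vectors.

        samples: list of (x, y) pairs, x a length-d tuple of 0/1 features,
        y the 0/1 target class.  All names here are illustrative.
        """
        w = [0.0] * d        # one weight per feature
        b = 0.0              # threshold (bias) term
        adjustments = 0      # number of weight adjustments made

        for _ in range(max_epochs):
            mistakes = 0
            for x, y in samples:
                activation = sum(wi * xi for wi, xi in zip(w, x)) + b
                prediction = 1 if activation > 0 else 0
                error = y - prediction            # -1, 0, or +1
                if error != 0:
                    # Perceptron rule: change each weight by error * input.
                    # The same error-driven form appears in Rescorla-Wagner.
                    for i in range(d):
                        w[i] += error * x[i]
                    b += error
                    adjustments += 1
                    mistakes += 1
            if mistakes == 0:                     # converged on separable data
                break
        return w, b, adjustments

    # Example: the OR function on d = 2 features converges in a few adjustments.
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    w, b, n = train_perceptron(data, d=2)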

