Abstract

Conventionally, the squared error (SE) and/or the relative entropy (RE) error is used as the cost function to be minimized when training neural networks via optimization algorithms. While these error measures are deduced directly from parameter values (such as the output and teacher values of the network), an alternative approach is to derive an error measure from the information (or negentropy) content associated with those parameters. That is, a cost-function-based optimization can be specified in the information-theoretic plane in terms of generalized maximum and/or minimum entropy considerations associated with the network. A set of minimum cross-entropy (or mutual information) error measures, known as Csiszar's measures, is deduced in terms of probabilistic attributes of the 'guess' (output) and 'true' (teacher) value parameters pertinent to neural network topologies. Their relative effectiveness in training a neural network optimally towards convergence (by realizing a predicted output close to the teacher function) is discussed with simulated results obtained from a test multi-layer perceptron. The Csiszar family of error measures indicated in this paper offers an alternative set of error functions, defined over a training set, which can be adopted for gradient-descent learning in neural networks using the backpropagation algorithm in lieu of the conventional SE and/or RE error measures. Relevant pros and cons of using Csiszar's error measures are discussed.
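The abstract does not specify which member of the Csiszar family is used in the simulations. The sketch below is a minimal illustration, not the authors' implementation: it assumes the KL-generating convex function f(t) = t log t in the standard Csiszar f-divergence D_f(p||q) = sum_i q_i f(p_i/q_i), a hypothetical 2-4-3 perceptron, and a finite-difference gradient in place of analytic backpropagation. The names csiszar_divergence, f_kl, and the layer sizes are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): a Csiszar f-divergence used
# as the training cost for a tiny multi-layer perceptron, in place of the
# conventional squared-error cost.
import numpy as np

def f_kl(t):
    # Convex generator f(t) = t*log(t); this choice recovers the KL divergence.
    return t * np.log(t)

def csiszar_divergence(p, q, f=f_kl, eps=1e-12):
    # D_f(p || q) = sum_i q_i * f(p_i / q_i); here p = teacher, q = network output.
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(q * f(p / q))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical 2-4-3 perceptron trained by finite-difference gradient descent
# on D_f, purely to illustrate the cost function; a real implementation would
# backpropagate the analytic gradient instead.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(2, 4))
W2 = rng.normal(scale=0.5, size=(4, 3))
x = np.array([0.2, 0.8])                 # input pattern (illustrative)
teacher = np.array([0.1, 0.7, 0.2])      # 'true' (teacher) probability-like values

def forward(W1_, W2_):
    h = np.tanh(x @ W1_)
    return softmax(h @ W2_)

def loss():
    return csiszar_divergence(teacher, forward(W1, W2))

lr, h = 0.5, 1e-5
for epoch in range(200):
    grads = []
    for W in (W1, W2):
        g = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            W[idx] += h
            up = loss()
            W[idx] -= 2 * h
            down = loss()
            W[idx] += h          # restore the weight
            g[idx] = (up - down) / (2 * h)
        grads.append(g)
    W1 -= lr * grads[0]
    W2 -= lr * grads[1]

print("final output:", forward(W1, W2), "teacher:", teacher)
print("final D_f   :", loss())
```

With f(t) = t log t the divergence reduces to the familiar relative entropy between teacher and output; other convex generators in the Csiszar family would simply swap out f_kl while leaving the training loop unchanged.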
