Abstract

Learning from examples in feedforward neural networks is studied using equilibrium statistical mechanics. Such an analysis is valid for stochastic learning algorithms that lead to a Gibbs distribution in the network weight space. Two simple approximations to the exact theory are presented: the high-temperature limit and the annealed approximation. Within these approximations, we study models of perceptron learning of realizable target rules. In each model, the target rule is perfectly realizable because it is given by another perceptron of identical architecture. We focus on the generalization curve, i.e., the average generalization error as a function of the number of examples. For continuously varying weights, learning is known to be gradual, with generalization curves that asymptotically obey inverse power laws. Here we discuss two model perceptrons whose weights are constrained to be discrete and which exhibit sudden learning. For a linear output, there is a first-order transition at low temperatures from a state of poor generalization to a state of good generalization. Beyond the transition, the generalization error decays exponentially to zero. For a Boolean output, the first-order transition is to perfect generalization at all temperatures. Monte Carlo simulations confirm that these approximate analytical results are quantitatively accurate at high temperatures and qualitatively correct at low temperatures. For unrealizable rules, however, the annealed approximation breaks down in general. Finally, we propose a general classification of generalization curves in models of realizable rules.
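The Monte Carlo simulations referred to above sample student weight configurations from the Gibbs distribution P(w) proportional to exp(-beta E_t(w)), where E_t(w) is the training error and beta the inverse temperature; this exponential form is the standard one for such stochastic learning, though the abstract does not spell it out. The sketch below, which is illustrative and not the authors' code, shows Metropolis sampling for the Boolean-output model with binary weights and a teacher of identical architecture. The sizes N and P, the inverse temperature beta, and the sweep count are arbitrary assumptions.

import numpy as np

# Illustrative Metropolis sampler (not the paper's code): weight
# configurations are visited with Gibbs probability ~ exp(-beta * E_t(w)),
# where E_t counts training errors on P examples. The rule is realizable:
# labels come from a teacher perceptron of identical architecture.
rng = np.random.default_rng(0)
N, P, beta, sweeps = 51, 200, 2.0, 2000   # arbitrary illustrative values

teacher = rng.choice([-1.0, 1.0], size=N)      # target rule: same architecture
X = rng.choice([-1.0, 1.0], size=(P, N))       # P random training examples
y = np.sign(X @ teacher)                       # Boolean (sign) teacher output

def train_error(w):
    """Number of training examples the student misclassifies."""
    return np.sum(np.sign(X @ w) != y)

w = rng.choice([-1.0, 1.0], size=N)            # random initial student
E = train_error(w)

for _ in range(sweeps):
    i = rng.integers(N)                        # propose a single weight flip
    w[i] = -w[i]
    E_new = train_error(w)
    # Metropolis rule: accept if energy drops, else with prob exp(-beta*dE)
    if E_new <= E or rng.random() < np.exp(-beta * (E_new - E)):
        E = E_new
    else:
        w[i] = -w[i]                           # reject: undo the flip

# Generalization error: probability of disagreeing with the teacher on a
# fresh random input, estimated here on a held-out test sample.
X_test = rng.choice([-1.0, 1.0], size=(10000, N))
eps = np.mean(np.sign(X_test @ teacher) != np.sign(X_test @ w))
print(f"training errors: {E}/{P}, estimated generalization error: {eps:.3f}")

Sweeping P at fixed N and averaging over independent runs would trace out the generalization curve discussed above, in the standard notation alpha = P/N; near the first-order transition one would expect long equilibration times, consistent with the low-temperature regime being harder to probe numerically.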
