Abstract

The Boltzmann machine uses the relative entropy as a cost function to fit the Boltzmann distribution to a given, fixed target distribution. Instead of the relative entropy, we use the mutual information between input and output units to define an unsupervised analogue of the conventional Boltzmann machine. Our network of Ising spins is fed by an external field via the input units. The output units should self-organize to form an ``internal'' representation of the ``environmental'' input, thereby compressing the data and extracting relevant features. The mutual information and its gradient with respect to the weights in principle require nonlocal information, e.g., in the form of multipoint correlation functions. Hence the exact gradient can hardly be reduced to a local learning rule. Conversely, using only local terms and two-point interactions, the output layer cannot be guaranteed to reach the maximum entropy possible for a fixed number of output neurons; some redundancy may remain in the representation of the data at the output. We account for this limitation from the outset by reformulating the cost function accordingly. From this cost function, local Hebb-like learning rules can be derived. Experiments with these local learning rules are presented.
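To make the contrast between the two cost functions concrete, here is a minimal sketch in standard notation; the symbols below are our own shorthand and not necessarily those of the paper. For visible units with target distribution $P(\mathbf{v})$ and Boltzmann (model) distribution $Q_W(\mathbf{v})$, the conventional cost is the relative entropy
\[
D(P \,\|\, Q_W) \;=\; \sum_{\mathbf{v}} P(\mathbf{v}) \,\ln \frac{P(\mathbf{v})}{Q_W(\mathbf{v})},
\]
whereas the unsupervised variant instead maximizes the mutual information between input states $\mathbf{x}$ and output states $\mathbf{y}$,
\[
I(\mathbf{x};\mathbf{y}) \;=\; H(\mathbf{y}) - H(\mathbf{y}\mid\mathbf{x}),
\]
where $H(\mathbf{y})$ is the entropy of the output layer and $H(\mathbf{y}\mid\mathbf{x})$ its conditional entropy given the input. A learning rule is called local in this context when the update of a weight $w_{ij}$ depends only on quantities available at the two connected spins, e.g., pairwise correlations $\langle s_i s_j\rangle$, which is why the exact gradient of $I(\mathbf{x};\mathbf{y})$, involving higher-order correlations, is hard to implement locally.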
