Abstract
The Boltzmann machine uses the relative entropy as a cost function to fit the Boltzmann distribution to a given target distribution. Instead of the relative entropy, we use the mutual information between input and output units to define an unsupervised analogue of the conventional Boltzmann machine. Our network of Ising spins is fed by an external field via the input units. The output units should self-organize to form an ``internal'' representation of the ``environmental'' input, thereby compressing the data and extracting relevant features. The mutual information and its gradient with respect to the weights require, in principle, nonlocal information, e.g., in the form of multipoint correlation functions. Hence the exact gradient can hardly be reduced to a local learning rule. Conversely, using only local terms and two-point interactions, the entropy of the output layer cannot be guaranteed to reach the maximum possible entropy for a fixed number of output neurons. Some redundancy may therefore remain in the representation of the data at the output. We account for this limitation from the outset by reformulating the cost function accordingly. From this cost function, local Hebb-like learning rules can be derived. Experiments with these local learning rules are presented.
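For orientation, the cost function referred to above admits the standard decomposition (the notation here is generic and not taken from the paper)
\[
I(\mathbf{x};\mathbf{y}) \;=\; H(\mathbf{y}) - H(\mathbf{y}\mid\mathbf{x})
\;=\; \sum_{\mathbf{x},\mathbf{y}} p(\mathbf{x})\, p(\mathbf{y}\mid\mathbf{x})\,
\ln \frac{p(\mathbf{y}\mid\mathbf{x})}{p(\mathbf{y})},
\]
where \(\mathbf{x}\) denotes the input (external field) and \(\mathbf{y}\) the output spins. The marginal \(p(\mathbf{y}) = \sum_{\mathbf{x}} p(\mathbf{x})\, p(\mathbf{y}\mid\mathbf{x})\) entering the output entropy \(H(\mathbf{y})\) sums over all input patterns, which is one way to see why the exact gradient involves nonlocal, higher-order correlations rather than purely local two-point terms.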