Abstract

Restricted Boltzmann Machines (RBMs), two-layer probabilistic graphical models that can also be interpreted as feed-forward neural networks, enjoy much popularity for pattern analysis and generation. Training RBMs, however, is challenging: it is based on likelihood maximization, but the likelihood and its gradient are computationally intractable. Therefore, training algorithms such as Contrastive Divergence (CD) and learning based on Parallel Tempering (PT) rely on Markov chain Monte Carlo (MCMC) methods to approximate the gradient. This thesis contributes to the understanding of RBM training methods by giving an empirical and theoretical analysis of the bias of the CD approximation and a bound on the mixing rate of PT. Furthermore, it improves RBM training by proposing a new transition operator that leads to faster-mixing Markov chains, by investigating an alternative parameterization of the RBM model class referred to as centered RBMs, and by exploring estimation techniques from statistical physics for approximating the likelihood. Finally, an analysis of the representational power of deep belief networks with real-valued visible variables is given.
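
To make the CD approximation mentioned above concrete: for a binary RBM with weights W and biases b, c, the log-likelihood gradient with respect to a weight w_ij is the difference of expectations <v_i h_j>_data - <v_i h_j>_model. The model expectation requires the intractable partition function, and CD-k replaces it with statistics taken after k steps of block Gibbs sampling started at the training data. The following NumPy sketch illustrates this standard scheme; it is a minimal illustration, not the thesis's implementation, and the function name cd_k_update and its signature are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd_k_update(W, b, c, v0, k=1, lr=0.01):
    """One CD-k parameter update for a binary RBM (illustrative sketch).

    W: (n_visible, n_hidden) weight matrix; b, c: visible/hidden biases.
    v0: batch of binary training vectors, shape (batch, n_visible).
    """
    # Positive phase: hidden activation probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)

    # Negative phase: k steps of block Gibbs sampling, chain started at the data.
    v, ph = v0, ph0
    for _ in range(k):
        h = (rng.random(ph.shape) < ph).astype(v0.dtype)   # sample hidden units
        pv = sigmoid(h @ W.T + b)                          # p(v = 1 | h)
        v = (rng.random(pv.shape) < pv).astype(v0.dtype)   # sample visible units
        ph = sigmoid(v @ W + c)                            # p(h = 1 | v)

    # CD-k gradient approximation: data statistics minus chain statistics.
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - v.T @ ph) / batch
    b += lr * (v0 - v).mean(axis=0)
    c += lr * (ph0 - ph).mean(axis=0)
    return W, b, c

if __name__ == "__main__":
    # Tiny usage example with assumed, arbitrary shapes.
    n_vis, n_hid = 6, 4
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b, c = np.zeros(n_vis), np.zeros(n_hid)
    v0 = (rng.random((16, n_vis)) < 0.5).astype(float)
    W, b, c = cd_k_update(W, b, c, v0, k=1)
```

In practice k = 1 is common; truncating the chain this way is exactly what introduces the bias whose empirical and theoretical analysis the thesis presents.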
