Equilibrium and non-equilibrium regimes in the learning of restricted Boltzmann machines*

*This article is an updated version of: Decelle A, Furtlehner C and Seoane B 2021 Equilibrium and non-equilibrium regimes in the learning of restricted Boltzmann machines Advances in Neural Information Processing Systems vol 34 ed M Ranzato, A Beygelzimer, Y Dauphin, P S Liang and J Wortman Vaughan (New

Aurélien Decelle, Beatriz Seoane, Cyril Furtlehner

doi:10.1088/1742-5468/ac98a7


Open Access


Abstract

Training restricted Boltzmann machines (RBMs) has been challenging for a long time due to the difficulty of precisely computing the log-likelihood gradient. Over the past few decades, many works have proposed more or less successful training recipes, but without studying the crucial quantity of the problem: the mixing time, i.e. the number of Monte Carlo iterations needed to sample new configurations from a model. In this work, we show that this mixing time plays a crucial role in the dynamics and stability of the trained model, and that RBMs operate in two well-defined regimes, namely equilibrium and out-of-equilibrium, depending on the interplay between the mixing time of the model and the number of steps, k, used to approximate the gradient. We further show empirically that the mixing time increases during learning, which often implies a transition from one regime to the other as soon as k becomes smaller than this time. In particular, we show that with the popular k-step (persistent) contrastive divergence approaches, with k small, the dynamics of the learned model are extremely slow and often dominated by strong out-of-equilibrium effects. In contrast, RBMs trained in equilibrium display faster dynamics and a smooth convergence to dataset-like configurations during sampling. Finally, we discuss how to exploit both regimes in practice, depending on the task one aims to fulfill: (i) small k can be used to generate convincing samples in short learning times, (ii) large (or increasingly large) k is needed to learn the correct equilibrium distribution of the RBM. The existence of these two operational regimes seems to be a general property of energy-based models trained via likelihood maximization.
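To make the role of k concrete, the following is a minimal sketch of how a k-step (persistent) contrastive divergence estimate of the weight gradient might look for a small binary RBM. It is not the authors' code: the function names, the {0, 1} unit convention, and the plain-list data layout are all assumptions made for illustration. The model term of the gradient comes from chains run for only k Gibbs steps, so it matches the equilibrium average only when k exceeds the mixing time.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij), binary {0, 1} units assumed
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return [1 if rng.random() < p else 0 for p in probs], probs


def sample_visible(h, W, b, rng):
    # p(v_i = 1 | h) = sigmoid(b_i + sum_j W_ij h_j)
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return [1 if rng.random() < p else 0 for p in probs], probs


def gibbs_k(v0, W, b, c, k, rng):
    # Run k alternating Gibbs steps (hidden, then visible) starting from v0.
    v = list(v0)
    for _ in range(k):
        h, _ = sample_hidden(v, W, c, rng)
        v, _ = sample_visible(h, W, b, rng)
    return v


def cd_k_weight_gradient(data, W, b, c, k, rng, chains=None):
    """Estimate <v h>_data - <v h>_model using k Gibbs steps per chain.

    chains=None  -> CD-k: every chain restarts from a data point.
    chains=[...] -> PCD-k: chains continue from their previous state.
    Returns the gradient estimate and the final chain states.
    """
    n_v, n_h = len(b), len(c)
    starts = data if chains is None else chains
    grad = [[0.0] * n_h for _ in range(n_v)]
    new_chains = []
    for v_data, v_start in zip(data, starts):
        _, ph_data = sample_hidden(v_data, W, c, rng)
        v_model = gibbs_k(v_start, W, b, c, k, rng)
        _, ph_model = sample_hidden(v_model, W, c, rng)
        for i in range(n_v):
            for j in range(n_h):
                grad[i][j] += (v_data[i] * ph_data[j]
                               - v_model[i] * ph_model[j]) / len(data)
        new_chains.append(v_model)
    return grad, new_chains


# Hypothetical toy usage: 3 visible units, 2 hidden units, zero-initialised model.
rng = random.Random(0)
W = [[0.0, 0.0] for _ in range(3)]
b, c = [0.0] * 3, [0.0] * 2
data = [[1, 0, 1], [0, 1, 1]]
grad, chains = cd_k_weight_gradient(data, W, b, c, k=2, rng=rng)         # CD-2
grad_p, chains = cd_k_weight_gradient(data, W, b, c, 2, rng, chains)     # PCD-2
```

Passing the returned `chains` back in on the next update gives the persistent variant; leaving `chains=None` restarts every chain from the data. In either case, increasing `k` as training proceeds is the knob the abstract refers to when it contrasts the two regimes.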


Journal: Journal of Statistical Mechanics: Theory and Experiment
Publication Date: Nov 1, 2022
Citations: 4
License type: iop-standard


Similar Papers

On CPU Performance Optimization of Restricted Boltzmann Machine and Convolutional RBM
Baptiste Wicht ... Andreas Fischer
01 Jan 2015

Accelerate Training of Restricted Boltzmann Machines via Iterative Conditional Maximum Likelihood Estimation
Mingqi Wu ... Faming Liang
Statistics and its interface | VOL. 12
01 Jan 2019

Restricted Boltzmann Machines Without Random Number Generators for Efficient Digital Hardware Implementation
Sansei Hori ... Takashi Morie
01 Jan 2015

Training Restricted Boltzmann Machines
Asja Fischer
KI - Künstliche Intelligenz | VOL. 29
12 May 2015
