Efficient training of energy-based models using Jarzynski equality

* This article is an updated version of: Carbone D, Hua M, Coste S and Vanden-Eijnden E 2023 Efficient training of energy-based models using Jarzynski equality Advances in Neural Information Processing Systems vol 36, ed A Oh, T Naumann, A Globerson, K Saenko, M Hardt and S Levine (Curran Associates, Inc.) pp 52583–614.

Davide Carbone, Mengjian Hua, Simon Coste, Eric Vanden-Eijnden

doi:10.1088/1742-5468/ad65e0

Open Access

Abstract

Energy-based models (EBMs) are generative models inspired by statistical physics with a wide range of applications in unsupervised learning. Their performance is well measured by the cross-entropy (CE) of the model distribution relative to the data distribution. Using the CE as the training objective is challenging, however, because computing its gradient with respect to the model parameters requires sampling the model distribution. Here, we show how results from nonequilibrium thermodynamics based on the Jarzynski equality, together with tools from sequential Monte Carlo sampling, can be used to perform this computation efficiently and avoid the uncontrolled approximations made by the standard contrastive divergence algorithm. Specifically, we introduce a modification of the unadjusted Langevin algorithm (ULA) in which each walker acquires a weight that enables estimation of the gradient of the CE at any step during gradient descent, thereby bypassing the sampling biases induced by slow mixing of the ULA. We illustrate these results with numerical experiments on Gaussian mixture distributions as well as the MNIST and CIFAR-10 datasets. The proposed approach outperforms methods based on the contrastive divergence algorithm in all the situations considered.
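
The weighted-walker idea can be illustrated on a toy problem. The following is a minimal sketch and not the authors' implementation: it assumes a one-dimensional quadratic energy U_theta(x) = (x - theta)^2 / 2, a prescribed parameter schedule in place of actual gradient descent, and a standard sequential Monte Carlo log-weight increment built from the ULA transition density and its time reversal. All function names (`alpha`, `weighted_ula`, `weighted_mean`, `ce_gradient`) are hypothetical.

```python
import numpy as np

def U(x, theta):
    # Toy energy (assumption): unit-variance Gaussian centred at theta.
    return 0.5 * (x - theta) ** 2

def grad_U(x, theta):
    # Gradient of the toy energy with respect to x.
    return x - theta

def alpha(x, y, theta, h):
    # Minus the log of exp(-U(x)) times the ULA density of moving x -> y,
    # with the symmetric |y - x|^2 / (4h) term dropped (it cancels in the ratio).
    g = grad_U(x, theta)
    return U(x, theta) + 0.5 * (y - x) * g + 0.25 * h * g ** 2

def weighted_ula(thetas, n_walkers=5000, h=0.1, seed=0):
    # Propagate walkers with ULA while the parameter follows `thetas`,
    # accumulating one log-weight per walker.
    rng = np.random.default_rng(seed)
    x = rng.normal(thetas[0], 1.0, size=n_walkers)  # exact samples of the initial model
    log_w = np.zeros(n_walkers)
    for theta_old, theta_new in zip(thetas[:-1], thetas[1:]):
        noise = rng.normal(size=n_walkers)
        x_new = x - h * grad_U(x, theta_old) + np.sqrt(2.0 * h) * noise
        # Forward move scored under the old parameter, reversed move under the new one.
        log_w += alpha(x, x_new, theta_old, h) - alpha(x_new, x, theta_new, h)
        x = x_new
    return x, log_w

def weighted_mean(values, log_w):
    # Self-normalised importance-sampling average (max subtracted for stability).
    w = np.exp(log_w - log_w.max())
    return np.sum(w * values) / np.sum(w)

def ce_gradient(data, x, log_w, theta):
    # Cross-entropy gradient for the toy energy: data average of dU/dtheta
    # minus the weighted model average of dU/dtheta, with dU/dtheta = -(x - theta).
    def dU_dtheta(z):
        return -(z - theta)
    return np.mean(dU_dtheta(data)) - weighted_mean(dU_dtheta(x), log_w)

thetas = np.linspace(0.0, 1.0, 51)
x, log_w = weighted_ula(thetas)
print("weighted mean:", weighted_mean(x, log_w))
print("CE gradient for data at 2.0:", ce_gradient(np.full(10, 2.0), x, log_w, thetas[-1]))
```

In this sketch the weights reweight the walkers toward the current model distribution even though the walkers themselves lag behind the moving parameter, so the weighted average in `ce_gradient` does not rely on the ULA chain having mixed.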

Journal: Journal of Statistical Mechanics: Theory and Experiment
Publication Date: Oct 21, 2024
Citations: 1
License type: cc-by
