Abstract

Learning latent variable models with deep top-down architectures typically requires inferring the latent variables for each training example from their posterior distribution. This inference step usually relies on either time-consuming long-run Markov chain Monte Carlo (MCMC) sampling or a separate inference model for variational learning. In this paper, we propose to use a short-run MCMC, such as a short-run Langevin dynamics, as an approximate flow-based inference engine. The bias in the output distribution of the non-convergent short-run Langevin dynamics is corrected by optimal transport (OT), which transforms the biased distribution produced by the finite-step MCMC into the prior distribution at minimum transport cost. Our experiments not only verify the effectiveness of the OT correction for the short-run MCMC, but also demonstrate that the latent variable model trained with the proposed strategy outperforms the variational auto-encoder (VAE) in image reconstruction, image generation, and anomaly detection.
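
As a minimal sketch of the inference step described above (not the authors' implementation), short-run Langevin inference of the latents amounts to a few gradient-plus-noise updates on the log joint density; the generator `G`, step count `K`, step size `s`, and observation noise `sigma` below are hypothetical placeholders chosen for illustration.

```python
import torch

def short_run_langevin(G, x, z_dim, K=20, s=0.1, sigma=0.3):
    """Approximate posterior sampling of z given x via K-step Langevin dynamics.

    G: top-down generator mapping latent z to data space (assumed).
    K, s, sigma: step count, step size, observation noise (hypothetical values).
    """
    # Initialize from the N(0, I) prior.
    z = torch.randn(x.size(0), z_dim, requires_grad=True)
    for _ in range(K):
        # Log joint: log p(z) + log p(x | z), with a Gaussian observation model.
        recon = G(z)
        log_p = -0.5 * (z ** 2).sum() - 0.5 * ((x - recon) ** 2).sum() / sigma ** 2
        grad = torch.autograd.grad(log_p, z)[0]
        # Langevin update: gradient ascent on the log joint plus Gaussian noise.
        z = (z + 0.5 * s ** 2 * grad + s * torch.randn_like(z)).detach().requires_grad_(True)
    return z.detach()
```

Because K is finite, the marginal of the returned samples is biased away from the true posterior flow; the paper's OT correction then pushes this biased marginal toward the N(0, I) prior, which in practice could be approximated with, e.g., a Sinkhorn-style transport solver (an assumption on our part, not a detail stated in the abstract).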
