Data-Dependent Conditional Priors for Unsupervised Learning of Multimodal Data.

Frantzeska Lavda, Magda Gregorová, Alexandros Kalousis

doi:10.3390/e22080888

Abstract

One of the major shortcomings of variational autoencoders is their inability to generate samples from the individual modalities of data originating from mixture distributions. This is primarily due to the use of a simple isotropic Gaussian as the prior for the latent code in the ancestral sampling procedure used for data generation. In this paper, we propose a novel formulation of variational autoencoders, conditional prior VAE (CP-VAE), with a two-level generative process for the observed data, in which a continuous and a discrete latent variable are introduced in addition to the observed variables. By learning data-dependent conditional priors, the new variational objective naturally encourages a better match between the posterior and prior conditionals, as well as the unsupervised learning of latent categories that encode the major source of variation of the original data. By sampling the continuous latent code from the data-dependent conditional priors, we are able to generate new samples from the individual mixture components corresponding to the multimodal structure of the original data. Moreover, we unify and analyse our objective under different independence assumptions for the joint distribution of the continuous and discrete latent variables. We provide an empirical evaluation on one synthetic dataset and three image datasets, FashionMNIST, MNIST, and Omniglot, illustrating the generative performance of our new model compared to multiple baselines.

Highlights

Variational autoencoders (VAEs) [1,2] are deep generative models for learning complex data distributions.
We propose a new VAE formulation, conditional prior VAE (CP-VAE), with a conditionally structured latent representation that encourages a better match between the prior and the posterior distributions by jointly learning their parameters from the data.
We introduce CP-VAE, an unsupervised generative model that is able to learn the multimodal probabilistic structure of the data.

Summary

Introduction

Variational autoencoders (VAEs) [1,2] are deep generative models for learning complex data distributions. They consist of an encoding and a decoding network that parametrize, respectively, the variational approximate posterior and the conditional data distribution in a latent variable generative model. Multiple strategies have been proposed to increase the richness or interpretability of the latent code [3,4,5,6,7,8,9,10,11,12]. These mostly argue for a more flexible posterior inference procedure or for the use of more complex approximate posterior distributions to facilitate the encoding of non-trivial data structures within the latent space. We propose a new VAE formulation, conditional prior VAE (CP-VAE), with a two-level hierarchical generative model combining categorical and continuous (Gaussian) latent variables.
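To make the two-level generative process concrete, the following is a minimal sketch of ancestral sampling through a categorical variable and then a Gaussian latent code. It is an illustration under stated assumptions, not the authors' implementation: the function name `sample_two_level`, the diagonal-covariance conditional priors, the fixed parameter values, and the toy `decoder` are all hypothetical stand-ins for quantities that the model would learn from data.

```python
import numpy as np

def sample_two_level(n, pi, mus, sigmas, decoder, rng, component=None):
    """Ancestral sampling: c ~ Cat(pi), z ~ N(mu_c, diag(sigma_c^2)), x = decoder(z).

    If `component` is given, every sample is drawn from that single
    conditional prior, i.e. from one mixture component only.
    """
    if component is None:
        c = rng.choice(len(pi), size=n, p=pi)       # discrete latent category
    else:
        c = np.full(n, component, dtype=int)        # fix the category
    eps = rng.standard_normal((n, mus.shape[1]))
    z = mus[c] + sigmas[c] * eps                    # continuous latent code
    return c, z, decoder(z)

rng = np.random.default_rng(0)
K, D = 3, 2                                         # hypothetical sizes
pi = np.full(K, 1.0 / K)                            # assumed uniform category prior
mus = np.array([[-4.0, 0.0], [0.0, 4.0], [4.0, 0.0]])  # assumed prior means
sigmas = np.full((K, D), 0.5)                       # assumed prior std devs

# Toy stand-in for a trained decoder network (assumption).
W = rng.standard_normal((D, 5))
decoder = lambda z: np.tanh(z @ W)

c, z, x = sample_two_level(1000, pi, mus, sigmas, decoder, rng)
c1, z1, x1 = sample_two_level(1000, pi, mus, sigmas, decoder, rng, component=1)
```

Fixing `component` is what distinguishes this scheme from sampling under a single isotropic Gaussian prior: the latent codes `z1` concentrate around one conditional prior mean, so the decoded samples come from a single mode of the data.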

Objectives

Results

Conclusion