Abstract

Sequential variational autoencoders (VAEs) with a global latent variable z have been studied for disentangling the global features of data, which is useful for several downstream tasks. To further assist the sequential VAEs in obtaining meaningful z, existing approaches introduce a regularization term that maximizes the mutual information (MI) between the observation and z. However, by analyzing the sequential VAEs from the information-theoretic perspective, we claim that simply maximizing the MI encourages the latent variable to have redundant information, thereby preventing the disentanglement of global features. Based on this analysis, we derive a novel regularization method that makes z informative while encouraging disentanglement. Specifically, the proposed method removes redundant information by minimizing the MI between z and the local features by using adversarial training. In the experiments, we trained two sequential VAEs, state-space and autoregressive model variants, using speech and image datasets. The results indicate that the proposed method improves the performance of downstream classification and data generation tasks, thereby supporting our information-theoretic perspective for the learning of global features.
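The abstract's adversarial MI-minimization can be illustrated with the density-ratio trick: a discriminator trained to distinguish joint samples (z, s) from shuffled pairs yields an estimate of I(z; s), which an encoder could then be trained to minimize. Below is a minimal, self-contained sketch of the estimator alone on toy correlated Gaussians; the features, learning rate, and sample counts are illustrative choices, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20000
z = rng.standard_normal(n)
s = 0.9 * z + np.sqrt(1 - 0.81) * rng.standard_normal(n)  # corr(z, s) = 0.9

def feats(z, s):
    # Quadratic features suffice for a Gaussian log density ratio.
    return np.stack([np.ones_like(z), z, s, z * z, s * s, z * s], axis=1)

# Positives: joint samples (z_i, s_i); negatives: shuffled pairs ~ p(z)p(s).
X = np.concatenate([feats(z, s), feats(z, rng.permutation(s))])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Full-batch logistic regression as the discriminator.
w = np.zeros(X.shape[1])
for _ in range(5000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.2 * X.T @ (p - y) / len(y)

# Density-ratio estimate: I(z; s) ~= E_joint[log D / (1 - D)] = E_joint[w . f(z, s)].
mi_est = float(np.mean(feats(z, s) @ w))
```

For this toy pair the closed-form value is I(z; s) = -0.5 ln(1 - 0.9^2) ≈ 0.83 nats, so the estimate can be sanity-checked against it. In the adversarial setting, the encoder would receive the negative of this estimate as a regularization signal.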

Highlights

  • Uncovering the global factors of variation from high-dimensional data is a significant and relevant problem in representation learning (Bengio et al 2013)

  • Sequential variational autoencoders (VAEs) with a global latent variable z play an important role in the unsupervised learning of global features

  • A typical issue is that the latent variable z is ignored by an expressive decoder, such as a state-space model (SSM) or an autoregressive model (ARM), and becomes uninformative, which is referred to as posterior collapse (PC). This phenomenon occurs because such decoders can model the data well on their own: the additional latent variable z cannot assist in improving the evidence lower bound (ELBO), which is the objective function of VAEs, so the decoders will not use z (Chen et al 2017; Alemi et al 2018)


Summary

Introduction

Uncovering the global factors of variation in high-dimensional data is a significant and relevant problem in representation learning (Bengio et al 2013). Sequential VAEs with a global latent variable z address this problem, but they are prone to posterior collapse: with expressive decoders, such as SSMs or ARMs, the additional latent variable z cannot assist in improving the evidence lower bound (ELBO), which is the objective function of VAEs, so the decoders will not use z (Chen et al 2017; Alemi et al 2018). To alleviate this issue, existing approaches regularize the mutual information (MI) between x and z to be large, for example by using β-VAE (Alemi et al 2018) or adversarial training (Makhzani and Frey 2017). We evaluated the ability of controlled generation using a novel evaluation method inspired by Ravuri and Vinyals (2019), and confirmed that CMI-maximizing regularization consistently outperformed MI-maximizing regularization. These results support our information-theoretic view of learning global features: sequential VAEs can acquire redundant features when merely maximizing the MI. The two terms I(x; z) and I(z; s) were shown to work complementarily in our experiments with two models across two domains (speech and image datasets), indicating that the proposed method would help improve various previously proposed sequential VAEs.
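The redundancy argument rests on the chain rule of mutual information: when a local feature s is a deterministic function of x, I(x; z) = I(s; z) + I(x; z | s), so maximizing I(x; z) alone can be satisfied by inflating the unwanted I(s; z) term rather than the conditional term. A small discrete sketch (toy distribution; all probabilities and the choice of s as the parity of x are hypothetical) verifies this identity numerically:

```python
import math
from collections import defaultdict

def mi(joint):
    """I(A;B) in nats from a dict {(a, b): p}."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return sum(p * math.log(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

def cmi(joint3):
    """I(A;B|C) in nats from a dict {(a, b, c): p}."""
    pc, pac, pbc = defaultdict(float), defaultdict(float), defaultdict(float)
    for (a, b, c), p in joint3.items():
        pc[c] += p
        pac[(a, c)] += p
        pbc[(b, c)] += p
    return sum(p * math.log(p * pc[c] / (pac[(a, c)] * pbc[(b, c)]))
               for (a, b, c), p in joint3.items() if p > 0)

# Toy joint p(x, z): 4 observations, 2 latent codes.
pxz = {(0, 0): 0.2, (1, 0): 0.1, (2, 0): 0.05, (3, 0): 0.05,
       (0, 1): 0.05, (1, 1): 0.05, (2, 1): 0.2, (3, 1): 0.3}

def s_of(x):
    return x % 2  # "local feature" s as a deterministic function of x

pxzs = {(x, z, s_of(x)): p for (x, z), p in pxz.items()}
psz = defaultdict(float)
for (x, z), p in pxz.items():
    psz[(s_of(x), z)] += p

I_xz = mi(pxz)                 # I(x; z)
I_sz = mi(dict(psz))           # I(s; z)
I_xz_given_s = cmi(pxzs)       # I(x; z | s)
# Chain rule: I(x; z) = I(s; z) + I(x; z | s)
```

The CMI-maximizing view follows directly: rewarding I(x; z | s) = I(x; z) - I(s; z) makes z informative about x while penalizing any information it shares with the local feature s.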

Sequential VAEs for learning global representations
State space model with global latent variable
Autoregressive model with global latent variable
Mutual information‐maximizing regularization for sequential VAEs
Problem in MI‐maximizing regularization
Conditional mutual information‐maximizing regularization
Estimation method of the regularization term
Objective function for DSAEs and PixelCNN‐VAEs
Related works
Settings
Speaker verification experiment with disentangled sequential autoencoders
Unsupervised learning for image classification
Controlled generation
Findings
Discussions and future works