Abstract

While broadly applicable to many natural language processing (NLP) tasks, variational autoencoders (VAEs) are hard to train due to the posterior collapse issue where the latent variable fails to encode the input data effectively. Various approaches have been proposed to alleviate this problem to improve the capability of the VAE. In this paper, we propose to introduce a mutual information (MI) term between the input and its latent variable to regularize the objective of the VAE. Since estimating the MI in the high-dimensional space is intractable, we employ neural networks for the estimation of the MI and provide a training algorithm based on the convex duality approach. Our experimental results on three benchmark datasets demonstrate that the proposed model, compared to the state-of-the-art baselines, exhibits less posterior collapse and has comparable or better performance in language modeling and text generation. We also qualitatively evaluate the inferred latent space and show that the proposed model can generate more reasonable and diverse sentences via linear interpolation in the latent space.
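The regularized objective described in the abstract can be sketched as the standard evidence lower bound (ELBO) plus a weighted mutual-information term. The weight λ below is an assumed trade-off hyperparameter; the paper's exact formulation is not given in this summary.

```latex
\mathcal{L}(\theta,\phi;x)
  = \underbrace{\mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right]
    - \mathrm{KL}\!\left(q_\phi(z\mid x)\,\|\,p(z)\right)}_{\text{standard ELBO}}
  \;+\; \lambda\, I(x; z)
```

Because I(x; z) is intractable in high dimensions, it is bounded from below via the convex-duality (Donsker–Varadhan) representation, with the supremum taken over statistics functions T parameterized by a neural network:

```latex
I(X; Z) \;\ge\; \sup_{T}\;
  \mathbb{E}_{p(x,z)}\!\left[T(x,z)\right]
  - \log \mathbb{E}_{p(x)p(z)}\!\left[e^{T(x,z)}\right]
```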

Highlights

  • Deep learning architectures are parameterized by families of non-linear functions, which learn multiple levels of more abstract representations (Bengio, 2009; Bengio et al., 2013)

  • We report negative log-likelihood (NLL), KL divergence (KL), perplexity (PPL), mutual information (MI), the number of active units (AU), forward perplexity (FPPL) and reverse perplexity (RPPL)

  • With the mutual information term included in the optimization objective, we believe that VAE-MINE allows more reasonable correspondence patterns between the input and its inferred latent variable, and thus better alleviates posterior collapse
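The convex-duality estimate of MI referenced in the highlights can be illustrated with a toy NumPy computation of the Donsker–Varadhan lower bound. In the actual model the statistics function T is a trained neural network; here a hand-picked quadratic critic stands in for it, so this is a sketch of the bound, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy correlated pair: z = x + noise, so the true I(X; Z) is positive
# (for this Gaussian case, 0.5 * log(1 + 1/0.25) ~= 0.805 nats).
n = 10_000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)

# Hypothetical statistics function T(x, z). MINE would train a neural
# network here; a fixed scaled product suffices to show the bound.
def T(x, z, a=0.4):
    return a * x * z

# Donsker-Varadhan lower bound:
#   I(X; Z) >= E_{p(x,z)}[T] - log E_{p(x)p(z)}[exp(T)]
joint_term = np.mean(T(x, z))
z_shuffled = rng.permutation(z)  # shuffling z simulates samples from p(x)p(z)
marginal_term = np.log(np.mean(np.exp(T(x, z_shuffled))))
mi_lower_bound = joint_term - marginal_term
print(mi_lower_bound)
```

As expected for a fixed (untrained) critic, the estimate is a strictly smaller lower bound on the true MI; training T tightens it.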


Summary

Introduction

Deep learning architectures are parameterized by families of non-linear functions, which learn multiple levels of increasingly abstract representations (Bengio, 2009; Bengio et al., 2013). The goal is to learn a compact representation that captures the salient structure of highly complex, high-dimensional unlabelled data, so that new data with some variations can be generated. Such models have been widely applied to a range of NLP tasks, such as language modeling (Bowman et al., 2016; Zhao et al., 2018a) and dialog generation (Zhao et al., 2017, 2018b). We focus on the VAE with recurrent neural networks (RNNs) as its encoder and decoder for text generation. The one-step-ahead predictions force RNNs to learn local correlations rather than global coherence, which is insufficient to capture the high-level abstractions that characterize text sequences. The prior p(z) is assumed to be a standard Gaussian distribution N(0, I).
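The standard-Gaussian prior mentioned above admits a closed-form KL term, and sampling from the approximate posterior is done with the reparameterization trick. The sketch below uses NumPy with hand-chosen encoder outputs (mu, logvar) rather than a trained RNN encoder, purely to illustrate the two computations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative encoder outputs for one input: mean and log-variance of
# the diagonal-Gaussian posterior q(z|x). Not from a trained network.
mu = np.array([0.5, -0.3])
logvar = np.array([-0.2, 0.1])

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# which keeps the sample differentiable w.r.t. mu and logvar.
eps = rng.normal(size=mu.shape)
z = mu + np.exp(0.5 * logvar) * eps

# Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian:
#   0.5 * sum(mu^2 + sigma^2 - logvar - 1)
kl = 0.5 * np.sum(mu**2 + np.exp(logvar) - logvar - 1.0)
print(z, kl)
```

Posterior collapse corresponds to this KL term being driven toward zero, i.e. q(z|x) matching the prior regardless of x, which is what the MI regularizer counteracts.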


