Abstract

The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences. However, an issue known as posterior collapse (or KL loss vanishing) arises when the VAE is used for text modelling: the approximate posterior collapses to the prior, and the model ignores the latent variables entirely, degrading into a plain language model during text generation. This issue is particularly prevalent when RNN-based VAE models are employed for text modelling. In this paper, we propose a simple, generic architecture called the Timestep-Wise Regularisation VAE (TWR-VAE), which can effectively avoid posterior collapse and can be applied to any RNN-based VAE model. The effectiveness and versatility of our model are demonstrated on different tasks, including language modelling and dialogue response generation.

Highlights

  • Variational Autoencoders (VAE) (Kingma and Welling, 2014; Rezende et al., 2014), together with other deep generative models, including Generative Adversarial Networks (Goodfellow et al., 2014) and autoregressive models (Oord et al., 2018), have attracted considerable attention in the research community, as they have shown the ability to learn compact representations from complex, high-dimensional unlabelled data.

  • We evaluate our Timestep-Wise Regularisation VAE (TWR-VAE) model on three public benchmark datasets, namely, Penn Treebank (PTB) (Marcus and Marcinkiewicz, 1993), Yelp15 (Yang et al., 2017), and Yahoo (Zhang et al., 2015), which have been widely used in previous work on text modelling (Bowman et al., 2016; Kim et al., 2018; Fu et al., 2019; He et al., 2019; Zhu et al., 2020).

  • We report performance on four metrics: negative log likelihood (NLL); perplexity (PPL); KL divergence, which measures the distance between two probability distributions; and the mutual information between the input x and the latent variable z, which measures how much information about x is captured by z.
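Two of these metrics have simple closed forms worth spelling out. Perplexity is the exponentiated per-token NLL, and for a diagonal Gaussian posterior against a standard normal prior the KL divergence is available analytically. A minimal sketch (plain Python; the function names are our own, not from the paper's code):

```python
import math

def perplexity(total_nll: float, n_tokens: int) -> float:
    """Perplexity is the exponentiated average negative log likelihood
    per token: PPL = exp(NLL / #tokens)."""
    return math.exp(total_nll / n_tokens)

def gaussian_kl(mu, logvar) -> float:
    """Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ),
    summed over latent dimensions:
    0.5 * sum(exp(logvar) + mu^2 - 1 - logvar)."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))
```

When the approximate posterior collapses to the prior, `gaussian_kl` goes to zero, which is exactly the "KL loss vanishing" symptom described in the abstract.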

Summary

Introduction

Variational Autoencoders (VAE) (Kingma and Welling, 2014; Rezende et al., 2014), together with other deep generative models, including Generative Adversarial Networks (Goodfellow et al., 2014) and autoregressive models (Oord et al., 2018), have attracted considerable attention in the research community, as they have shown the ability to learn compact representations from complex, high-dimensional unlabelled data. TWR-VAE shares some similarity with existing VAE-RNN models, where the input to the decoder is the latent variable sampled from the variational posterior at the final timestep of the encoder. While this is a reasonable design choice, we explore two model variants of TWR-VAE, namely, TWR-VAEmean and TWR-VAEsum. The contributions of our paper are three-fold: (1) we propose a simple and robust method that can effectively alleviate the posterior collapse issue of the VAE via timestep-wise regularisation; (2) our approach is generic and can be applied to any RNN-based VAE model; (3) our approach outperforms the state of the art on language modelling and yields better or comparable performance on dialogue response generation. The code of TWR-VAE is available at: https://github.com/ruizheliUOA/TWR-VAE
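The idea of timestep-wise regularisation can be illustrated with a small sketch: instead of imposing the KL term only on the posterior at the encoder's final timestep, a KL penalty is applied to the per-timestep posteriors, and the mean/sum variants aggregate the per-timestep latent samples. This is a minimal illustration in plain Python under our own simplifying assumptions (a standard normal prior at every timestep; `aggregate_latents` is a hypothetical helper named by us), not the paper's implementation:

```python
import math
import random

def reparameterise(mu, logvar, rng=random):
    """Standard reparameterisation trick: z = mu + sigma * eps,
    with eps ~ N(0, 1), applied per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def timestep_wise_kl(mus, logvars):
    """Sum KL( q_t || N(0, I) ) over every encoder timestep t, so that
    each per-timestep posterior is regularised, not just the final one."""
    total = 0.0
    for mu, lv in zip(mus, logvars):
        total += 0.5 * sum(math.exp(l) + m * m - 1.0 - l
                           for m, l in zip(mu, lv))
    return total

def aggregate_latents(zs, mode="mean"):
    """Hypothetical aggregation of per-timestep latent samples, in the
    spirit of the TWR-VAEmean / TWR-VAEsum variants: combine the samples
    from all timesteps into a single decoder input."""
    dim = len(zs[0])
    summed = [sum(z[d] for z in zs) for d in range(dim)]
    if mode == "mean":
        return [s / len(zs) for s in summed]
    return summed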

Related Work
Background of VAE
TWR-VAEmean and TWR-VAEsum
Language Modelling
Dialogue Response Generation
Conclusion
B The reparameterisation trick for our timestep-wise latent variables
D Training Details for Language Modelling
F Training Details for Dialogue Response Generation
G Examples of the latent representation interpolation on the Yahoo test dataset