Abstract

It has been previously observed that training Variational Recurrent Autoencoders (VRAE) for text generation suffers from a serious problem of uninformative latent variables. The model collapses into a plain language model that entirely ignores the latent variables and can only generate repetitive and dull samples. In this article, we explore the reason behind this issue and propose an effective regularizer-based approach to address it. The proposed method directly injects extra constraints on the posteriors of the latent variables into the learning process of the VRAE, which flexibly and stably controls the tradeoff between the Kullback-Leibler (KL) term and the reconstruction term, making the model learn dense and meaningful latent representations. Experimental results show that the proposed method outperforms several strong baselines, learns interpretable latent variables, and generates diverse, meaningful sentences. Furthermore, the proposed method performs well without relying on other strategies such as KL annealing.
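To make the idea of constraining the posterior concrete, the sketch below shows one plausible (hypothetical) form of such a regularized objective; the abstract does not specify the exact regularizer, so the target-KL penalty, the function name `regularized_vrae_loss`, and the parameters `target_kl` and `reg_weight` are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch of a regularized VRAE objective (not the paper's exact method).
# The extra term penalizes the squared deviation of the batch-averaged KL from a
# target value, so the KL neither collapses to zero nor grows without bound,
# giving direct control over the KL/reconstruction tradeoff.
import torch
import torch.nn.functional as F

def regularized_vrae_loss(recon_logits, targets, mu, logvar,
                          target_kl=5.0, reg_weight=1.0):
    # Reconstruction term: token-level cross-entropy over the decoded sequence,
    # averaged over the batch (targets has shape [batch, seq_len]).
    recon = F.cross_entropy(
        recon_logits.view(-1, recon_logits.size(-1)),
        targets.view(-1),
        reduction="sum",
    ) / targets.size(0)

    # KL divergence between q(z|x) = N(mu, diag(exp(logvar))) and the prior N(0, I),
    # averaged over the batch.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / mu.size(0)

    # Extra constraint on the posterior: keep the KL close to a target value
    # (an assumed stand-in for the paper's regularizer on the posteriors).
    regularizer = reg_weight * (kl - target_kl).pow(2)

    return recon + kl + regularizer
```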
