Abstract

Generative adversarial networks (GANs) were first proposed in 2014 and have been widely used in computer vision, for tasks such as image generation. However, GANs for text generation have made slow progress. One reason is that the discriminator's guidance to the generator is too weak: the generator receives only a "true or false" probability in return. Compared with the commonly used loss functions, the Wasserstein distance can provide more information to the generator, but in experiments RelGAN does not work well with the Wasserstein distance. In this paper, we propose an improved neural network based on RelGAN and Wasserstein loss, named WRGAN. Unlike RelGAN, we modify the discriminator network structure with 1D convolutions of multiple kernel sizes and, correspondingly, replace the network's loss function with a gradient-penalty Wasserstein loss. Our experiments on multiple public datasets show that WRGAN outperforms most existing state-of-the-art methods and improves Bilingual Evaluation Understudy (BLEU) scores.
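To make the two modifications concrete, below is a minimal PyTorch sketch written under our own assumptions: a critic built from parallel 1D convolutions with several kernel sizes, paired with the gradient-penalty Wasserstein loss. The embedding size, channel count, and kernel sizes (2, 3, 4, 5) are illustrative choices, not the paper's exact configuration.

```python
# Minimal sketch (PyTorch) of the two changes described above; all layer
# sizes and kernel sizes are illustrative assumptions, not the paper's
# exact configuration.
import torch
import torch.nn as nn


class MultiKernelConv1dCritic(nn.Module):
    """Discriminator (critic) built from parallel 1D convolutions with
    several kernel sizes. No sigmoid at the output: a Wasserstein critic
    returns an unbounded score rather than a true/false probability."""

    def __init__(self, vocab_size, emb_dim=64, channels=64,
                 kernel_sizes=(2, 3, 4, 5)):
        super().__init__()
        self.embed = nn.Linear(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, channels, k) for k in kernel_sizes)
        self.head = nn.Linear(channels * len(kernel_sizes), 1)

    def forward(self, x):
        # x: (batch, seq_len, vocab_size) soft one-hot token distributions
        h = self.embed(x).transpose(1, 2)        # (batch, emb_dim, seq_len)
        # Max-pool each conv's feature map over time, then concatenate.
        feats = [torch.relu(conv(h)).max(dim=2).values for conv in self.convs]
        return self.head(torch.cat(feats, dim=1))  # (batch, 1) critic score


def gradient_penalty(critic, real, fake, lambda_gp=10.0):
    """WGAN-GP term: penalize the critic when its gradient norm at random
    interpolations between real and generated samples deviates from 1."""
    eps = torch.rand(real.size(0), 1, 1, device=real.device)
    inter = (eps * real + (1.0 - eps) * fake).requires_grad_(True)
    grad, = torch.autograd.grad(critic(inter).sum(), inter, create_graph=True)
    return lambda_gp * ((grad.flatten(1).norm(2, dim=1) - 1.0) ** 2).mean()


# Critic loss: widen the score gap between real and fake, plus the penalty.
#   d_loss = critic(fake).mean() - critic(real).mean() \
#            + gradient_penalty(critic, real, fake)
# Generator loss: raise the critic's score on generated samples.
#   g_loss = -critic(fake).mean()
```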

Highlights

  • A generative adversarial network (GAN) [1] is an unsupervised learning method that learns by letting two neural networks play against each other

  • Our experiments on multiple public datasets show that WRGAN outperforms most existing state-of-the-art methods and improves Bilingual Evaluation Understudy (BLEU) scores

  • We propose a new architecture based on RelGAN and the Wasserstein GAN with gradient penalty (WGAN-GP)

Summary

Introduction

A generative adversarial network (GAN) [1] is an unsupervised learning method in which two neural networks learn by playing against each other. When a GAN faces discrete data, however, the discriminator cannot pass gradients back to the generator through backpropagation [2]. Existing solutions deal with this non-differentiability either by turning to reinforcement learning (RL) methods or by reformulating the problem in continuous space [5], but both approaches make the GAN more challenging to train and can cause mode collapse. RelGAN [5] instead combines a relational-memory-based generator, Gumbel–Softmax relaxation for training GANs on discrete data, and multiple embedded representations in the discriminator, and it performs very well on many datasets. Following this design, we train our GAN on discrete data with a relational-memory generator and a discriminator coordinated by Gumbel–Softmax relaxation.
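Since the Gumbel–Softmax relaxation is what lets gradients flow through discrete token choices, a short sketch of the mechanism may help. This is the standard relaxation (PyTorch also provides it as torch.nn.functional.gumbel_softmax); the temperature and tensor shapes are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F


def gumbel_softmax_sample(logits, tau=1.0):
    """Standard Gumbel-Softmax relaxation: perturb the generator's logits
    with Gumbel(0, 1) noise and apply a temperature-controlled softmax,
    yielding a differentiable "soft one-hot" token that the discriminator
    can consume."""
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel) / tau, dim=-1)


# logits: (batch, seq_len, vocab_size) from the generator. As tau -> 0 the
# samples approach hard one-hot tokens; a larger tau keeps them smooth so
# gradients can propagate back to the generator through the discriminator.
```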

Related Works
Overall Framework
Rebuilding the Discriminator
Loss Function
Training Parameters
Experiments
Evaluation Metrics
BLEU and NLL Scores
Comparison of RelGAN and WRGAN on COCO
EMNLP 2017 WMT News
Chinese Poetry
Impact of Dimension
Impact of k
Conclusions