Abstract

Training a deep generative adversarial network (GAN) with hundreds or even thousands of layers is difficult. Because the backpropagation path through the generator is deeper than that through the discriminator, the generator is especially prone to vanishing or exploding gradients. This paper proposes a method, based on mean field theory, for training a deep vanilla GAN. By adjusting the parameter variances and the activation function of the GAN, a 200-layer vanilla GAN can be trained stably without adding any batch normalization layers or residual blocks. We demonstrate that a deep GAN is very sensitive to the parameter variances $\sigma _w^2$, $\sigma _b^2$ in the initialization scheme, and explain why hard tanh is a more suitable activation than ReLU in a deep vanilla GAN. Experiments on the MNIST and Fashion-MNIST datasets validate that our method trains a deep vanilla GAN well and can produce high-quality images.
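
To make the idea concrete, the following is a minimal sketch (not the authors' released code) of a deep, fully connected vanilla generator with hard tanh activations and a mean-field-style initialization, in which each weight is drawn i.i.d. from N(0, $\sigma _w^2$/fan_in) and each bias from N(0, $\sigma _b^2$). The layer width, depth, and the placeholder values sigma_w = 1.0 and sigma_b = 0.05 are illustrative assumptions, not the tuned variances reported in the paper.

import torch
import torch.nn as nn

def mean_field_init(module, sigma_w=1.0, sigma_b=0.05):
    # Mean-field-style initialization: Var(W_ij) = sigma_w^2 / fan_in,
    # Var(b_i) = sigma_b^2. The concrete values here are placeholders.
    if isinstance(module, nn.Linear):
        fan_in = module.in_features
        nn.init.normal_(module.weight, mean=0.0, std=sigma_w / fan_in ** 0.5)
        nn.init.normal_(module.bias, mean=0.0, std=sigma_b)

def make_deep_generator(latent_dim=100, width=256, depth=200, out_dim=784):
    # A plain stack of Linear + Hardtanh layers: no batch normalization
    # and no residual blocks, matching the "vanilla" setting in the abstract.
    layers = [nn.Linear(latent_dim, width), nn.Hardtanh()]
    for _ in range(depth - 2):
        layers += [nn.Linear(width, width), nn.Hardtanh()]
    layers += [nn.Linear(width, out_dim), nn.Tanh()]
    net = nn.Sequential(*layers)
    net.apply(mean_field_init)
    return net

# Example: a 200-layer generator producing flattened 28x28 images (e.g. MNIST).
G = make_deep_generator()
z = torch.randn(8, 100)
fake_images = G(z)  # shape (8, 784)

Under the mean-field view, choosing $\sigma _w^2$ and $\sigma _b^2$ so that signal and gradient variances stay roughly constant from layer to layer is what allows such a deep stack to train without normalization layers; a bounded activation such as hard tanh keeps the forward signal from blowing up, which an unbounded ReLU does not.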
