Abstract

The entropy-regularized optimal transport (EOT) distance and its symmetric normalization, known as the Sinkhorn divergence, offer smooth, continuous distances that metrize weak convergence. They have excellent geometric properties and are useful for comparing probability distributions in some generative adversarial network (GAN) models. Computing them with the original Sinkhorn matrix scaling algorithm, however, remains expensive: the running time is quadratic, $\mathcal{O}(n^{2})$, in the size $n$ of the training dataset. This work investigates how to accelerate GAN training when the Sinkhorn divergence is used as the minimax objective. Let $\mathcal{G}$ be a Gaussian map from the ground space onto the positive orthant $\mathbb{R}_{+}^{r}$ with $r \ll n$. To speed up the divergence computation, we propose using $c(x,y) = -\varepsilon \log \langle \mathcal{G}(x), \mathcal{G}(y) \rangle$ as the ground cost. This approximation, known as Sinkhorn with positive features, brings the running time of the Sinkhorn matrix scaling algorithm down to $\mathcal{O}(r\,n)$, which is linear in $n$. To solve the minimax optimization in the GAN, we put forward a more efficient simultaneous stochastic gradient descent-ascent (SimSGDA) algorithm in place of the standard sequential gradient techniques. Empirical evidence shows that our model, trained with SimSGDA on the DCGAN architecture on the tiny-coloured Cats and CelebA datasets, converges to stationary points, which are the local Nash equilibria. Numerical experiments confirm that our model is computationally stable and generates samples of quality comparable to those produced by prior Sinkhorn and Wasserstein GANs. Further simulations, assessed with the structural similarity index measure (SSIM), show that our model's empirical convergence rate is comparable to that of WGAN-GP.
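A minimal NumPy sketch of this mechanism is given below, assuming a Gaussian-kernel positive feature map; the function names (`gaussian_positive_features`, `sinkhorn_low_rank`) and the toy data are illustrative, not the paper's code. With the proposed ground cost, the Gibbs kernel factorizes as $K_{ij} = e^{-c(x_i,y_j)/\varepsilon} = \langle \mathcal{G}(x_i), \mathcal{G}(y_j) \rangle$, i.e. $K = \Phi \Psi^{\top}$ for feature matrices $\Phi \in \mathbb{R}_{+}^{n \times r}$ and $\Psi \in \mathbb{R}_{+}^{m \times r}$, so each Sinkhorn iteration needs only thin matrix-vector products:

```python
import numpy as np

def gaussian_positive_features(X, U):
    # Positive random features: <G(x), G(y)> approximates the Gaussian
    # kernel exp(-||x - y||^2 / 2), via the identity
    # exp(x.y) = E_{u~N(0,I)}[exp(u.x - ||x||^2/2) exp(u.y - ||y||^2/2)].
    # X: (n, d) data, U: (r, d) Gaussian samples -> features (n, r), all > 0.
    r = U.shape[0]
    sq_norms = np.sum(X ** 2, axis=1, keepdims=True)       # (n, 1)
    return np.exp(X @ U.T - sq_norms) / np.sqrt(r)

def sinkhorn_low_rank(phi, psi, a, b, n_iter=200):
    # Sinkhorn scaling with the factored kernel K = phi @ psi.T, which is
    # never materialized: K v and K^T u each cost O((n + m) r), not O(n m).
    u = np.ones(phi.shape[0])
    v = np.ones(psi.shape[0])
    for _ in range(n_iter):
        u = a / (phi @ (psi.T @ v))    # u <- a / (K v)
        v = b / (psi @ (phi.T @ u))    # v <- b / (K^T u)
    return u, v

# Toy usage: uniform marginals on two small point clouds.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(500, 2)), rng.normal(size=(400, 2))
U = rng.normal(size=(64, 2))                               # r = 64 features
phi, psi = gaussian_positive_features(x, U), gaussian_positive_features(y, U)
a, b = np.full(500, 1 / 500), np.full(400, 1 / 400)
u, v = sinkhorn_low_rank(phi, psi, a, b)
# Approximate optimal coupling: P = diag(u) (phi @ psi.T) diag(v).
```

Because the kernel is never formed explicitly, memory usage also drops from $\mathcal{O}(nm)$ to $\mathcal{O}((n+m)\,r)$.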

Highlights

  • The introduction of the Wasserstein generative adversarial network (WGAN) in [1] and its subsequent exposition in [2] propelled optimal transport (OT) to popularity

  • IMPACT OF POSITIVE FEATURES: We investigated the impact of using Sinkhorn with positive features on the selected hyperparameters

  • We have proposed a new variant of Sinkhorn GAN


Summary

INTRODUCTION

The introduction of the Wasserstein GAN (WGAN) in [1] and its subsequent exposition in [2] propelled optimal transport (OT) to popularity. The entropy-regularized optimal transport (EOT) distance and its symmetric normalization, known as the Sinkhorn divergence, are powerful algorithmic tools. They help convexify the minimax objective and come with efficient solvers through the Sinkhorn matrix scaling algorithm. Despite theoretical progress on SimSGDA, prior Sinkhorn GAN implementations in [6], [7], [9] and [12] carried out the more expensive sequential game. It is solved by a sequential SGDA (SeqSGDA), in which $D_{\phi}$ updates its parameters several times, either sequentially or in an alternating manner, before $G_{\theta}$ does. For an $n \times m$ real-valued matrix $P$, that is, $P \in \mathbb{R}^{n \times m}$, the row average …
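To make the contrast concrete, below is a minimal PyTorch sketch of one SimSGDA step, in which both players differentiate the same objective at the same iterate before either updates; `loss_fn` and the optimizer objects are assumed placeholders for the Sinkhorn-divergence objective and its optimizers, not the paper's implementation.

```python
import torch

def simsgda_step(G, D, loss_fn, opt_G, opt_D, real, noise):
    # One simultaneous SGDA step: gradients for BOTH players are evaluated
    # at the same iterate (theta_t, phi_t) before either parameter set moves.
    # G (generator) descends the minimax objective; D (discriminator) ascends.
    # opt_G and opt_D are assumed to wrap G.parameters() and D.parameters().
    loss = loss_fn(G(noise), real, D)                       # scalar objective
    g_theta = torch.autograd.grad(loss, list(G.parameters()),
                                  retain_graph=True)
    g_phi = torch.autograd.grad(loss, list(D.parameters()))
    for p, g in zip(G.parameters(), g_theta):
        p.grad = g                                          # descent for G
    for p, g in zip(D.parameters(), g_phi):
        p.grad = -g                                         # ascent for D
    opt_G.step()
    opt_D.step()
```

SeqSGDA would instead loop the $D_{\phi}$ update several times, each recomputing the loss at the updated discriminator, before letting $G_{\theta}$ take a step.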

SGD Method Optimizer
CONTRIBUTIONS
Our study yields the following contributions:
EOT PRIMAL FORMULATION
EOT DUAL FORMULATION
EOT SEMI-DUAL FORMULATION
LEARNING GANS WITH SIMSGDA ALGORITHM
OPTIMALITY OF SIMULTANEOUS GAME
EXPERIMENTS
COMPARISON TO WGAN
CONCLUDING REMARKS
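
For reference, the EOT primal formulation and the Sinkhorn divergence named in the outline above admit the following standard statements, written here in the abstract's notation (a restatement of textbook definitions, not necessarily the paper's exact formulation):

$$\mathrm{OT}_{\varepsilon}(a,b) = \min_{P \in \Pi(a,b)} \langle P, C \rangle + \varepsilon \sum_{i,j} P_{ij}\left(\log P_{ij} - 1\right), \qquad \Pi(a,b) = \left\{ P \in \mathbb{R}_{+}^{n \times m} : P \mathbf{1}_{m} = a,\; P^{\top} \mathbf{1}_{n} = b \right\}$$

$$S_{\varepsilon}(a,b) = \mathrm{OT}_{\varepsilon}(a,b) - \tfrac{1}{2}\,\mathrm{OT}_{\varepsilon}(a,a) - \tfrac{1}{2}\,\mathrm{OT}_{\varepsilon}(b,b)$$

The dual and semi-dual formulations follow from this primal problem by Lagrangian duality on the two marginal constraints.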