Abstract

We present a novel mathematical model that seeks to capture the key design feature of generative adversarial networks (GANs). Our model consists of two interacting spin glasses, and we conduct an extensive theoretical analysis of the complexity of the model’s critical points using techniques from Random Matrix Theory. The result is a set of insights into the loss surfaces of large GANs that build upon prior insights for simpler networks but also reveal new structure unique to this setting, which explains the greater difficulty of training GANs.

Highlights

  • By making various modeling assumptions about standard multi-layer perceptron neural networks, [1] argued heuristically that the training loss surfaces of large networks could be modelled by a spherical multi-spin glass

  • The Hessians of real-world deep neural networks do not behave like random matrices from the Gaussian Orthogonal Ensemble (GOE) of Random Matrix Theory at the macroscopic scale [4,5,6], despite this being implied by the spin-glass model of [1]

  • We have contributed a novel model for the study of large neural network gradient descent dynamics with statistical physics techniques, namely an interacting spin-glass model for generative adversarial neural networks
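
To make the GOE baseline in the second highlight concrete, here is a minimal NumPy sketch that samples a GOE matrix and compares its spectrum with the Wigner semicircle law. The function names (`sample_goe`, `semicircle_density`) and the normalisation (off-diagonal variance 1/n, so the bulk spectrum fills [-2, 2]) are assumptions chosen for illustration and are not taken from the paper's code.

```python
import numpy as np


def sample_goe(n, rng):
    """Sample an n x n GOE matrix scaled so the bulk spectrum fills [-2, 2]."""
    a = rng.standard_normal((n, n))
    # Symmetrise: off-diagonal variance 1/n, diagonal variance 2/n.
    return (a + a.T) / np.sqrt(2.0 * n)


def semicircle_density(x):
    """Wigner semicircle density on [-2, 2]; zero outside the support."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.clip(4.0 - x**2, 0.0, None)) / (2.0 * np.pi)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eigs = np.linalg.eigvalsh(sample_goe(500, rng))
    hist, edges = np.histogram(eigs, bins=20, range=(-2.5, 2.5), density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Largest gap between the empirical histogram and the semicircle law.
    print(np.max(np.abs(hist - semicircle_density(centres))))
```

A Hessian spectrum from a trained network could be histogrammed in the same way and compared against `semicircle_density` to see how far it departs from the GOE prediction.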

Summary

Introduction

By making various modeling assumptions about standard multi-layer perceptron neural networks, [1] argued heuristically that the training loss surfaces of large networks could be modelled by a spherical multi-spin glass. We compare the effect of hyperparameters, such as the variance ratio and the size ratio, on our spin glass model and on the results of experiments training real GANs. Our calculations include some novel details: in particular, we use precise sub-leading terms for a limiting spectral density, obtained from supersymmetric methods, to prove a concentration result required to justify the use of the Coulomb gas approximation. We provide a first attempt to model an important architectural feature of modern deep neural networks within the framework of spin glass models, together with a detailed analysis of the properties of the resulting loss (energy) surface. All code used for numerical calculations of our model, training real GANs, analysing the results and generating plots is made available.
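
The spherical multi-spin glass referred to above can be illustrated numerically. The sketch below evaluates a single spherical p-spin Hamiltonian, H(w) = N^{-(p-1)/2} Σ X_{i1…ip} w_{i1}⋯w_{ip}, with i.i.d. standard Gaussian couplings X and w on the sphere of radius √N. This is one common convention for a single spin glass and is not the interacting two-glass model of the paper; the function names and the choice p = 3 are illustrative assumptions.

```python
import numpy as np


def sample_couplings(n, p, rng):
    """I.i.d. standard Gaussian coupling tensor with p indices of size n."""
    return rng.standard_normal((n,) * p)


def random_sphere_point(n, rng):
    """Uniform random point on the sphere of radius sqrt(n) in R^n."""
    w = rng.standard_normal(n)
    return w * np.sqrt(n) / np.linalg.norm(w)


def hamiltonian(couplings, w):
    """Contract every index of the coupling tensor with w, then normalise."""
    n, p = w.shape[0], couplings.ndim
    h = couplings
    for _ in range(p):
        h = h @ w  # contracts the last remaining index with w
    return float(h) / n ** ((p - 1) / 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 20, 3  # illustrative sizes, not values from the paper
    x = sample_couplings(n, p, rng)
    w = random_sphere_point(n, rng)
    print(hamiltonian(x, w))
```

With this normalisation the variance of H(w) at any fixed point on the sphere is N, which is why the energy is usually studied per degree of freedom.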

An Interacting Spin Glass Model
Kac-Rice Formulae for Complexity
Limiting Spectral Density of the Hessian
The Asymptotic Complexity
Structure of Low-Index Critical Points
Hyperparameter Effects
Effect of Variance Ratio
Effect of Size Ratio
Conclusions and Outlook