Abstract

We discuss the properties of the distributions of energies of minima obtained by gradient descent in complex energy landscapes. We find strikingly similar phenomenology across several prototypical models. We particularly focus on the distribution of energies of minima in the analytically well-understood p-spin-interaction spin glass model. We numerically find non-Gaussian distributions that resemble the Tracy-Widom distributions often found in problems of random correlated variables, and non-trivial finite-size scaling. Based on this, we propose a picture of gradient descent dynamics that highlights the importance of a first-passage process in the eigenvalues of the Hessian. This picture provides a concrete link to problems in which the Tracy-Widom distribution is established. Aspects of this first-passage view of gradient-descent dynamics are generic for non-convex complex landscapes, rationalizing the commonality that we find across models.

Highlights

  • The notion of an underlying complex energy landscape in glassy, disordered systems is useful [1,2,3,4,5,6,7,8] to the extent that the landscape can be reduced to relatively few properties that are relevant to observed phenomena

  • The additional inertial degree of freedom in the FIRE scheme can, in some individual cases, change the basin of attraction, so that relaxation from a specific initial condition leads to a different final minimum than direct gradient descent would

  • Such Dysonian dynamics are not general, as we discuss in the final section, but reflect key elements that are true for the evolution of any dynamical matrix


Summary

INTRODUCTION

The notion of an underlying complex energy landscape in glassy, disordered systems is useful [1,2,3,4,5,6,7,8] to the extent that the landscape can be reduced to relatively few properties that are relevant to observed phenomena. We look at the shape of the distribution of energies of minima obtained by gradient descent for several different models, with particular focus on the spherical p-spin-interaction spin glass. Such distributions, for example in jamming, have been assumed to be Gaussian [17]. Our central finding is that for all of these models, the distributions are non-Gaussian with nontrivial tail exponents on one side that are consistent with the Tracy-Widom distribution, a distribution mostly known for describing the edge fluctuations of the eigenvalues of Gaussian random matrices. We rationalize this finding with a perspective that might be the starting point for an eventual analytical approach.
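To make the central numerical experiment concrete, the following is a minimal sketch of gradient descent on a spherical p-spin landscape with p = 3, collecting the energy per spin of the minimum reached for several disorder samples. It is an illustration under stated assumptions, not the procedure used in the paper: the coupling normalization (variance 1/(2N^2), with the sum running over all ordered index triples), the tiny system size, the backtracking step-size rule, and all function names are choices made here for the example.

```python
import math
import random

def make_couplings(n, rng):
    # Assumed normalization: independent Gaussian J[i][j][k] with variance
    # 1/(2 n^2), so the energy is extensive on the sphere |s|^2 = n.
    sigma = math.sqrt(1.0 / (2.0 * n ** 2))
    return [[[rng.gauss(0.0, sigma) for _ in range(n)]
             for _ in range(n)] for _ in range(n)]

def energy(J, s):
    # H = -sum_{ijk} J_ijk s_i s_j s_k over all ordered triples (assumption).
    n = len(s)
    return -sum(J[i][j][k] * s[i] * s[j] * s[k]
                for i in range(n) for j in range(n) for k in range(n))

def gradient(J, s):
    # Each term J_ijk s_i s_j s_k contributes to three gradient components.
    n = len(s)
    g = [0.0] * n
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c = J[i][j][k]
                g[i] -= c * s[j] * s[k]
                g[j] -= c * s[i] * s[k]
                g[k] -= c * s[i] * s[j]
    return g

def project_to_sphere(s):
    # Rescale so that sum_i s_i^2 = N (the spherical constraint).
    n = len(s)
    scale = math.sqrt(n / sum(x * x for x in s))
    return [x * scale for x in s]

def descend(J, s, step=0.1, tol=1e-8, max_iter=500):
    # Projected gradient descent with backtracking: a trial step is accepted
    # only if it lowers the energy, so the energy never increases.
    s = project_to_sphere(s)
    e = energy(J, s)
    n = len(s)
    for _ in range(max_iter):
        g = gradient(J, s)
        radial = sum(gi * si for gi, si in zip(g, s)) / n
        tangent = [gi - radial * si for gi, si in zip(g, s)]
        if math.sqrt(sum(t * t for t in tangent)) < tol:
            break  # tangential gradient vanishes: stationary point on the sphere
        while step > 1e-12:
            trial = project_to_sphere(
                [si - step * ti for si, ti in zip(s, tangent)])
            e_trial = energy(J, trial)
            if e_trial < e:
                s, e = trial, e_trial
                step *= 1.2  # cautiously grow the step after a success
                break
            step *= 0.5
        else:
            break  # no energy-lowering step found at any step size
    return s, e

def sample_minimum_energies(n, samples, seed=0):
    # One random landscape and one random initial condition per sample.
    rng = random.Random(seed)
    result = []
    for _ in range(samples):
        J = make_couplings(n, rng)
        s0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        _, e = descend(J, s0)
        result.append(e / n)
    return result

if __name__ == "__main__":
    print(sample_minimum_energies(n=6, samples=4))
```

A histogram of many such energies, at much larger N than this toy size, is the kind of distribution whose shape is discussed here. A pure-Python triple loop costs O(N^3) per gradient, so a study at realistic sizes would need a vectorized or compiled implementation.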

MODELS AND COMPLEXITY
The p-spin model
The k-SAT model
The perceptron model
Jamming
NUMERICAL RESULTS
RATIONALIZATION OF RESULTS
CONCLUSIONS