Abstract

In a recent work (Halverson et al 2021 Mach. Learn.: Sci. Technol. 2 035002), Halverson, Maiti and Stoner proposed a description of neural networks (NNs) in terms of a Wilsonian effective field theory. The infinite-width limit is mapped to a free field theory, while finite-N corrections are taken into account by interactions (non-Gaussian terms in the action). In this paper, we study two related aspects of this correspondence. First, we comment on the concepts of locality and power counting in this context. Indeed, these usual space-time notions may not hold for NNs (since inputs can be arbitrary); however, the renormalization group (RG) provides natural notions of locality and scaling. Moreover, we comment on several subtleties, for example, that data components may not have a permutation symmetry: in that case, we argue that random tensor field theories could provide a natural generalization. Second, we improve the perturbative Wilsonian renormalization from Halverson et al (2021 Mach. Learn.: Sci. Technol. 2 035002) by providing an analysis in terms of the non-perturbative RG using the Wetterich-Morris equation. An important difference from the usual non-perturbative RG analysis is that only the effective infrared 2-point function is known, which requires setting up the problem with care. Our aim is to provide a useful formalism to investigate the behavior of NNs beyond the large-width limit (i.e. far from the Gaussian limit) in a non-perturbative fashion. A major result of our analysis is that changing the standard deviation of the NN weight distribution can be interpreted as a renormalization flow in the space of networks. We focus on translation-invariant kernels and provide preliminary numerical results.
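For orientation, the two central objects the abstract refers to can be sketched schematically; the notation and the precise form of the interaction terms below are our own illustrative choices, not necessarily those of the paper. In the infinite-width limit the network output f behaves as a free (Gaussian) field with kernel K, finite-width corrections enter as interaction terms suppressed by powers of 1/N, and the non-perturbative analysis rests on the Wetterich-Morris flow equation for the scale-dependent effective action \Gamma_k:

\[
S[f] \;=\; \frac{1}{2}\int \mathrm{d}x\,\mathrm{d}y\; f(x)\,K^{-1}(x,y)\,f(y) \;+\; \sum_{n} \frac{g_n}{N^{\alpha_n}} \int \mathrm{d}x\; f(x)^n ,
\qquad
\partial_t \Gamma_k \;=\; \frac{1}{2}\,\mathrm{Tr}\!\left[\bigl(\Gamma_k^{(2)} + R_k\bigr)^{-1}\,\partial_t R_k\right] ,
\]

where R_k is the infrared regulator, t = \ln k, and the couplings g_n and exponents \alpha_n are placeholders for the finite-N corrections.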

Highlights

  • Introduction and outline: Deep learning and neural networks (NNs) [2, 3] have experienced rapid development in the last decade, with an ever-increasing number of remarkable applications.

  • We introduce two notions of scale which emerge from the analysis: the first is attached to the data and called …

  • We have pushed further the use of the renormalization group for the NN-QFT correspondence [1, 33], which states that a neural network can be represented by a quantum field theory (see the numerical sketch after this list).

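As a rough numerical illustration of the last point, the snippet below samples a one-hidden-layer network over random weight draws and estimates the connected 4-point function of its output: this quantity vanishes in the infinite-width (free-theory) limit and survives at finite width N as an interaction effect. It is a minimal sketch under our own assumptions (ReLU activation, Gaussian initialization, a single input point); none of the function names or parameter choices come from the paper.

```python
import numpy as np

# Minimal sketch of the NN-QFT picture: over random draws of the weights, the
# output of an infinitely wide network is a Gaussian field, so its connected
# 4-point function vanishes; at finite width N it is non-zero and shrinks as
# the width grows. All choices below (ReLU, Gaussian init, input point) are
# illustrative assumptions, not the setup of the paper.

rng = np.random.default_rng(0)

def sample_outputs(x, width, n_draws, sigma_w=1.0, sigma_b=1.0, chunk=20_000):
    """Scalar outputs f(x) of a 1-hidden-layer ReLU net for n_draws independent weight draws."""
    d_in = x.shape[0]
    out = np.empty(n_draws)
    for start in range(0, n_draws, chunk):           # chunk the draws to limit memory
        n = min(chunk, n_draws - start)
        W = rng.normal(0.0, sigma_w / np.sqrt(d_in), size=(n, width, d_in))
        b = rng.normal(0.0, sigma_b, size=(n, width))
        V = rng.normal(0.0, sigma_w / np.sqrt(width), size=(n, width))
        h = np.maximum(W @ x + b, 0.0)               # hidden layer, shape (n, width)
        out[start:start + n] = np.einsum("nw,nw->n", V, h)
    return out

def connected_4pt(f):
    """Connected 4-point function <f^4> - 3 <f^2>^2 of the centered samples."""
    f = f - f.mean()
    return np.mean(f**4) - 3.0 * np.mean(f**2) ** 2

x = np.ones(3)                                       # an arbitrary input point
for width in (5, 20, 80):
    f = sample_outputs(x, width, n_draws=400_000)
    print(f"width N = {width:3d}   connected 4-point ~ {connected_4pt(f):+.4f}")
```

The printed estimates should decrease roughly like 1/N as the width grows, consistent with finite-width interactions being suppressed in the large-width limit.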


Introduction

Deep learning and neural networks (NNs) [2, 3] have experienced rapid development in the last decade, with an ever-increasing number of remarkable applications. In many cases, these systems outperform humans and ordinary algorithms. Yet there is no complete theoretical understanding of why deep learning works so well or how to improve it further. It is not clear how training can be made faster and more efficient, how knowledge can be transferred to other tasks, or how to choose hyperparameters systematically.

