Abstract

We introduce a Gaussian Process (GP) generalization of ResNets (with unknown functions of the network replaced by GPs and identified via MAP estimation), which includes ResNets (trained with L2 regularization on weights and biases) as a particular case (when employing particular kernels). We show that ResNets (and their warping GP regression extension) converge, in the infinite depth limit, to a generalization of image registration variational algorithms. In this generalization, images are replaced by functions mapping input/output spaces to a space of unexpressed abstractions (ideas), and material points are replaced by data points. Whereas computational anatomy aligns images via warping of the material space, this generalization aligns ideas (or abstract shapes as in Plato’s theory of forms) via the warping of the Reproducing Kernel Hilbert Space (RKHS) of functions mapping the input space to the output space. While the Hamiltonian interpretation of ResNets is not new, it was based on an Ansatz. We do not rely on this Ansatz and present the first rigorous proof of convergence of ResNets with trained weights and biases toward a flow driven by Hamiltonian dynamics. Since our proof is constructive and based on discrete and continuous mechanics, it reveals several remarkable properties of ResNets and their GP generalization. ResNet regressors are kernel regressors with data-dependent warping kernels. Minimizers of L2-regularized ResNets satisfy a discrete least-action principle implying the near preservation of the norm of weights and biases across layers. The trained weights of ResNets with scaled/strong L2 regularization can be identified by solving an autonomous Hamiltonian system. The trained ResNet parameters are unique up to (a function of) the initial momentum, and the initial momentum representation of those parameters is generally sparse. The kernel (nugget) regularization strategy provides a provably robust alternative to Dropout for ANNs. We introduce a functional generalization of GPs and show that pointwise GP/RKHS error estimates lead to probabilistic and deterministic generalization error estimates for ResNets. When performed with feature maps, the proposed analysis identifies the (EPDiff) mean-field limit of trained ResNet parameters as the number of data points goes to infinity. The search for good architectures can be reduced to the search for good kernels, and we show that the composition of warping regression blocks with reduced equivariant multichannel kernels (introduced here) recovers and generalizes CNNs to arbitrary spaces and groups of transformations.
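To make the abstract's "kernel regressors with data-dependent warping kernels" claim concrete, the following is a minimal numerical sketch (not the paper's algorithm): inputs are transported by a residual (ResNet-style) map and standard kernel ridge regression with a nugget regularizer is then applied to the warped inputs. The warping parameters, kernel bandwidth, nugget value, and toy data are all illustrative assumptions; in the paper's setting the warping would be identified via MAP / L2-regularized training rather than fixed at random.

```python
# Sketch of warping-kernel regression under the assumptions stated above.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def phi(X, W1, b1, W2, steps=4):
    """Toy residual warping: a few explicit-Euler ResNet steps applied to the inputs."""
    for _ in range(steps):
        X = X + (1.0 / steps) * np.tanh(X @ W1 + b1) @ W2
    return X

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(40, 1))                    # training inputs
y = np.sin(2 * X[:, 0]) + 0.05 * rng.normal(size=40)    # noisy targets

# Fixed (untrained) warping parameters, used only to illustrate the structure.
W1 = 0.5 * rng.normal(size=(1, 8))
b1 = 0.1 * rng.normal(size=8)
W2 = 0.5 * rng.normal(size=(8, 1))

Z = phi(X, W1, b1, W2)                                  # warped training inputs
nugget = 1e-3                                           # kernel ("nugget") regularization
K = rbf_kernel(Z, Z) + nugget * np.eye(len(Z))
alpha = np.linalg.solve(K, y)                           # ridge / GP-MAP coefficients

X_test = np.linspace(-2, 2, 5)[:, None]
y_pred = rbf_kernel(phi(X_test, W1, b1, W2), Z) @ alpha
print(np.round(y_pred, 3))
```

In this sketch the nugget term plays the role of the kernel regularization that the abstract proposes as an alternative to Dropout; the composed kernel K(phi(x), phi(x')) is the "data-dependent warping kernel" once phi is trained rather than fixed.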
