Learning curves of generic features maps for realistic datasets with a teacher-student model*

*This article is an updated version of: Loureiro B, Gerbelot C, Cui H, Goldt S, Krzakala F, Mezard M and Zdeborová L 2021 Learning curves of generic features maps for realistic datasets with a teacher-student model Advances in Neural Information Processing Systems vol 34 ed M Ranzato,

Bruno Loureiro, Sebastian Goldt, Hugo Cui, Lenka Zdeborová, Marc Mézard, Cédric Gerbelot, Florent Krzakala

doi:10.1088/1742-5468/ac9825

Abstract

Teacher-student models provide a framework in which the typical-case performance of high-dimensional supervised learning can be described in closed form. The assumption of Gaussian i.i.d. input data underlying the canonical teacher-student model may, however, be perceived as too restrictive to capture the behaviour of realistic data sets. In this paper, we introduce a Gaussian covariate generalisation of the model in which the teacher and student can act on different spaces, generated with fixed but generic feature maps. While still solvable in closed form, this generalisation is able to capture the learning curves for a broad range of realistic data sets, thus redeeming the potential of the teacher-student framework. Our contribution is two-fold. First, we prove a rigorous formula for the asymptotic training loss and generalisation error. Second, we present a number of situations where the learning curve of the model captures that of a realistic data set learned with kernel regression and classification, with out-of-the-box feature maps such as random projections or scattering transforms, or with pre-learned ones, such as the features learned by training multi-layer neural networks. We discuss both the power and the limitations of the framework.
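The setting described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's closed-form result: it estimates a learning curve by simulation. Everything specific here is an assumption made for illustration: the linear maps `A` and `B` applied to a shared latent Gaussian vector (so that teacher and student features are jointly Gaussian), the linear teacher with additive noise, the ridge-regression student, and all parameter names and default values.

```python
import numpy as np


def simulate_learning_curve(sample_sizes, p=30, d=40, latent_dim=50,
                            ridge=0.1, noise_std=0.1, n_test=2000, seed=0):
    """Empirical test error of a ridge student for each training-set size.

    Hypothetical setup: teacher features u = z A and student features
    v = z B share one latent Gaussian vector z, so (u, v) is jointly Gaussian.
    """
    rng = np.random.default_rng(seed)
    # Fixed (but generic) linear feature maps for teacher and student.
    A = rng.standard_normal((latent_dim, p)) / np.sqrt(latent_dim)
    B = rng.standard_normal((latent_dim, d)) / np.sqrt(latent_dim)
    theta = rng.standard_normal(p)  # teacher weights

    def sample(n):
        z = rng.standard_normal((n, latent_dim))
        u = z @ A  # teacher space (never shown to the student)
        v = z @ B  # student space
        y = u @ theta / np.sqrt(p) + noise_std * rng.standard_normal(n)
        return v, y

    v_test, y_test = sample(n_test)
    errors = []
    for n in sample_sizes:
        v, y = sample(n)
        # Ridge regression: solve (v^T v + ridge * I) w = v^T y.
        w = np.linalg.solve(v.T @ v + ridge * np.eye(d), v.T @ y)
        errors.append(float(np.mean((v_test @ w - y_test) ** 2)))
    return errors


if __name__ == "__main__":
    sizes = [20, 200, 2000]
    for n, err in zip(sizes, simulate_learning_curve(sizes)):
        print(f"n = {n:5d}  test MSE = {err:.3f}")
```

Because the student only sees `v`, part of the teacher's signal is unreachable in this sketch, so the test error should plateau above the noise level as `n` grows rather than vanish.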

Journal: Journal of Statistical Mechanics: Theory and Experiment
Publication Date: Nov 1, 2022
Citations: 15
License type: cc-by

Similar Papers

Generalization error rates in kernel regression: the crossover from the noiseless to noisy regime* *This article is an updated version of: Cui H, Loureiro B, Krzakala F and Zdeborová L 2021 Generalization error rates in kernel regression: the crossover from the noiseless to noisy regime Advances in Neural Information Processing Systems vol 34 ed M Ranzato, A Beygelzimer, Y Dauphin,
Hugo Cui ... Florent Krzakala
Journal of Statistical Mechanics: Theory and Experiment | VOL. 2022
01 Nov 2022

Spectral bias and task-model alignment explain generalization in kernel regression and infinitely wide neural networks
Abdulkadir Canatar ... Blake Bordelon
Nature Communications | VOL. 12
18 May 2021

Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime
arXiv (Cornell University)
31 May 2021

Globally optimal learning rates for multilayer neural networks
David Saad ... Magnus Rattray
Philosophical Magazine B | VOL. 77
01 May 1998
