Abstract
Deep learning has exhibited a number of surprising generalization phenomena that are not captured by classical statistical learning theory. This talk will survey some of my work on theoretical characterizations of several such intriguing phenomena: (1) Implicit regularization: A major mystery in deep learning is that deep neural networks often generalize well despite their excessive expressive capacity. Toward explaining this mystery, it has been suggested that commonly used gradient-based optimization algorithms enforce a certain implicit regularization that effectively constrains the model capacity. (2) Benign overfitting: In certain scenarios, a model can perfectly fit noisily labeled training data yet still achieve near-optimal test error, in sharp contrast to the classical picture of overfitting. (3) Grokking: In certain scenarios, a model initially achieves perfect training accuracy but no generalization (i.e., test performance no better than a random predictor), and upon further training transitions to almost perfect generalization. Theoretically establishing these properties often involves making appropriate high-dimensional assumptions on the problem as well as a careful analysis of the training dynamics.
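To make the benign overfitting phenomenon concrete, the following minimal sketch (not taken from the talk; all dimensions, the covariance spectrum, and the noise level are assumptions chosen purely for illustration) fits a minimum-norm linear interpolator to noisy labels in an overparameterized regression problem, the linear setting in which benign overfitting has been analyzed theoretically. The interpolator fits the noisy training labels exactly, yet its test error remains close to the noise floor.

import numpy as np

# Illustrative sketch of benign overfitting with a minimum-norm linear
# interpolator. The problem sizes and spectrum below are assumptions
# for illustration, not parameters from the talk.
rng = np.random.default_rng(0)

n, d = 200, 5000          # n samples, d >> n features (overparameterized)
signal_dims = 2           # the true signal lives in a few strong directions

# Covariance spectrum: a few large eigenvalues carrying the signal plus a
# long flat tail of small ones, the kind of decay theory identifies as benign.
scales = np.ones(d)
scales[:signal_dims] = 100.0

beta_star = np.zeros(d)
beta_star[:signal_dims] = 1.0

def sample(m):
    X = rng.standard_normal((m, d)) * scales
    y = X @ beta_star + rng.standard_normal(m)   # noisy labels
    return X, y

X_train, y_train = sample(n)
X_test, y_test = sample(2000)

# Minimum-norm interpolator: with d > n it fits the training noise exactly.
beta_hat = np.linalg.pinv(X_train) @ y_train

train_mse = np.mean((X_train @ beta_hat - y_train) ** 2)
test_mse = np.mean((X_test @ beta_hat - y_test) ** 2)
noise_floor = np.mean((X_test @ beta_star - y_test) ** 2)

print(f"train MSE:   {train_mse:.2e}  (noisy labels interpolated exactly)")
print(f"test MSE:    {test_mse:.2f}")
print(f"noise floor: {noise_floor:.2f}")

In this regime the long tail of small eigenvalues absorbs the label noise harmlessly, so zero training error coexists with near-optimal test error; with a slower-decaying or low-dimensional spectrum, the same interpolation would be harmful.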