Deep double descent: where bigger models and more data hurt*

*This article is an updated version of: Nakkiran P, Kaplun G, Bansal Y, Yang T, Barak B and Sutskever I 2020 Deep double descent: where bigger models and more data hurt Int. Conf. Learning Representations.

Preetum Nakkiran, Boaz Barak, Ilya Sutskever, Tristan Yang, Gal Kaplun, Yamini Bansal

doi:10.1088/1742-5468/ac3a74


Open Access


Journal: Journal of Statistical Mechanics: Theory and Experiment
Publication Date: Dec 1, 2021
Citations: 148
License type: iop-standard

Affiliation: Harvard University, OpenAI (United States)


Abstract

We show that a variety of modern deep learning tasks exhibit a ‘double-descent’ phenomenon where, as we increase model size, performance first gets worse and then gets better. Moreover, we show that double descent occurs not just as a function of model size, but also as a function of the number of training epochs. We unify the above phenomena by defining a new complexity measure we call the effective model complexity and conjecture a generalized double descent with respect to this measure. Furthermore, our notion of model complexity allows us to identify certain regimes where increasing (even quadrupling) the number of train samples actually hurts test performance.
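The model-size effect described in the abstract can be illustrated with a toy linear analogue. The sketch below is not the paper's deep-network setup: it fits minimum-norm least squares on synthetic Gaussian data and varies how many input features the model may use. The function name, the data model, and every parameter value are assumptions chosen only for illustration. In this setting one would expect test error to rise as the number of features approaches the number of training samples (the interpolation threshold) and to fall again beyond it.

```python
import numpy as np

def min_norm_test_error(n_features, n_train=40, n_total=200, noise=0.5,
                        n_test=500, n_trials=20, seed=0):
    """Average test MSE of least squares restricted to the first n_features inputs.

    Hypothetical toy setup: labels are a noisy linear function of n_total
    Gaussian inputs, and the model only sees the first n_features of them.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_trials):
        # True weights over all inputs, scaled so their squared norm is about 1.
        beta = rng.normal(size=n_total) / np.sqrt(n_total)
        X_train = rng.normal(size=(n_train, n_total))
        X_test = rng.normal(size=(n_test, n_total))
        y_train = X_train @ beta + noise * rng.normal(size=n_train)
        y_test = X_test @ beta  # noiseless targets for evaluation

        # The pseudoinverse gives the ordinary least-squares fit when
        # n_features < n_train and the minimum-norm interpolating solution
        # when n_features >= n_train.
        w = np.linalg.pinv(X_train[:, :n_features]) @ y_train
        pred = X_test[:, :n_features] @ w
        errors.append(np.mean((pred - y_test) ** 2))
    return float(np.mean(errors))

# Sweep model size across the interpolation threshold (n_train = 40).
for p in (5, 10, 20, 40, 80, 200):
    print(f"n_features={p:4d}  test_mse={min_norm_test_error(p):.3f}")
```

Because every call uses the same seed, each model size is evaluated on the same synthetic datasets, so differences in the printed errors reflect model size rather than sampling noise.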
