Abstract

This work derives an analytical expression for the global minima of a deep linear network with weight decay and stochastic neurons, a fundamental model for understanding the loss landscape of neural networks. Our result implies that the origin is a special point in the deep neural network loss landscape, where highly nonlinear phenomena emerge. We show that weight decay interacts strongly with the model architecture and can create bad minima at zero in a network with more than one hidden layer, a behavior qualitatively different from that of a network with only one hidden layer. Practically, our result implies that common deep learning initialization methods are generally insufficient to ease the optimization of neural networks.
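To make the depth dependence concrete, the following is a minimal scalar sketch, not the paper's exact setting: a product of scalar weights with L2 weight decay, probed numerically around the origin. The loss form and the values of x, y, and lam are illustrative assumptions. With a weak weight decay, the origin is a saddle for a network with one hidden layer (two weight factors) but becomes a local minimum once there is more than one hidden layer (three or more factors), since all mixed second derivatives of the product term vanish at zero and only the weight-decay curvature remains.

import numpy as np

# Illustrative sketch (not the paper's model): a scalar deep linear
# "network" f(x) = w_D * ... * w_1 * x with L2 weight decay.
def loss(weights, x=1.0, y=1.0, lam=0.1):
    pred = np.prod(weights) * x
    return (pred - y) ** 2 + lam * np.sum(np.asarray(weights) ** 2)

rng = np.random.default_rng(0)
for depth, label in [(2, "one hidden layer"), (3, "two hidden layers")]:
    origin = np.zeros(depth)
    base = loss(origin)
    eps = 1e-3  # probe random directions at radius ~eps around the origin
    deltas = [loss(origin + eps * rng.standard_normal(depth)) - base
              for _ in range(10_000)]
    print(f"{label}: min loss change around origin = {min(deltas):+.2e}")

# Expected output: a negative minimum change for depth 2 (the origin is a
# saddle whenever lam < |x*y|), and a nonnegative one for depth 3 (weight
# decay makes the origin a local minimum for any lam > 0).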
