Stochastic Gradient Descent and Anomaly of Variance-Flatness Relation in Artificial Neural Networks

Xia Xiong, Yong-Cong Chen, Chunxiao Shi, Ping Ao

doi:10.1088/0256-307x/40/8/080202


Open Access

https://doi.org/10.1088/0256-307x/40/8/080202


Journal: Chinese Physics Letters
Publication Date: Jul 19, 2023
Citations: 1
License type: iop-standard

Affiliation: Shanghai University, Sichuan University

Abstract

Stochastic gradient descent (SGD), a widely used algorithm in deep-learning neural networks, has attracted continuing research interest in the theoretical principles behind its success. A recent work reported an anomalous (inverse) relation between the variance of neural weights and the landscape flatness of the loss function under SGD [Feng Y and Tu Y Proc. Natl. Acad. Sci. USA 118 e2015617118 (2021)]. To investigate this seeming violation of a statistical-physics principle, we analyze the properties of SGD near fixed points with a dynamic decomposition method. Our approach recovers the true “energy” function under which the universal Boltzmann distribution holds. This function differs from the cost function in general, which resolves the paradox raised by the anomaly. The study bridges the gap between classical statistical mechanics and the emerging discipline of artificial intelligence, with the potential to yield better algorithms for the latter.
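For readers who want the equilibrium baseline that the anomaly is measured against, the following minimal sketch may help. It is not the paper's dynamic decomposition method. It assumes overdamped Langevin dynamics with constant, isotropic noise in a quadratic loss L(x) = a x^2 / 2 along each direction, for which the Boltzmann distribution gives a stationary variance of T / a, so flatter directions (smaller a) fluctuate more. The function name, the "temperature" parameter, the curvature values, and the step sizes are all illustrative assumptions. SGD noise is in general neither isotropic nor constant, which is one reason this baseline need not hold for SGD.

```python
import math
import random


def stationary_variances(curvatures, temperature=0.1, dt=0.02,
                         steps=500, chains=1000, seed=0):
    """Estimate the stationary variance along independent quadratic directions.

    Hypothetical toy model (an assumption, not the paper's method):
    Euler-Maruyama integration of dx = -a * x * dt + sqrt(2 * T * dt) * xi,
    with xi drawn from a standard normal distribution.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * temperature * dt)
    variances = []
    for a in curvatures:
        finals = []
        for _ in range(chains):
            x = 0.0
            for _ in range(steps):
                # Gradient step on L = a * x^2 / 2 plus isotropic noise.
                x += -a * x * dt + noise_scale * rng.gauss(0.0, 1.0)
            finals.append(x)
        mean = sum(finals) / chains
        # Variance of the final position across independent chains.
        variances.append(sum((x - mean) ** 2 for x in finals) / chains)
    return variances


temperature = 0.1
curvatures = [1.0, 4.0]  # flat direction first, sharp direction second
variances = stationary_variances(curvatures, temperature=temperature)
for a, v in zip(curvatures, variances):
    print(f"curvature {a}: simulated variance {v:.4f}, Boltzmann T/a = {temperature / a:.4f}")
```

In this toy setting the flatter direction ends up with the larger variance. The inverse relation reported by Feng and Tu for SGD runs the other way, which is the apparent conflict the abstract refers to.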


