Abstract
In this paper, we introduce a novel methodology for characterizing the performance of deep learning networks (ResNets and DenseNets) with respect to training convergence and generalization as a function of mini-batch size and learning rate for image classification. This methodology is based on novel measurements derived from the eigenvalues of the approximate Fisher information matrix, which can be computed efficiently even for high-capacity deep models. Our proposed measurements can help practitioners monitor and control the training process (by actively tuning the mini-batch size and learning rate) to achieve good training convergence and generalization. Furthermore, the proposed measurements allow us to show that the training process can be optimized with a new dynamic sampling training approach that continuously and automatically changes the mini-batch size and learning rate during training. Finally, we show that the proposed dynamic sampling training approach trains faster and achieves competitive classification accuracy compared with the current state of the art.
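The abstract does not specify which Fisher approximation the paper uses, so the sketch below illustrates one common choice: the empirical Fisher, built from per-example gradients as F ≈ (1/N) Σᵢ gᵢgᵢᵀ. The model, data, and shapes here are hypothetical stand-ins, deliberately tiny so the full eigendecomposition is cheap to compute.

```python
# Minimal sketch of eigenvalues of an empirical Fisher approximation.
# Assumptions: empirical Fisher F ~= (1/N) * sum_i g_i g_i^T; tiny MLP
# stand-in (NOT the paper's ResNet/DenseNet); synthetic data.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
loss_fn = nn.CrossEntropyLoss()

# Synthetic mini-batch standing in for image features/labels.
x = torch.randn(32, 8)
y = torch.randint(0, 3, (32,))

# One flattened gradient vector per example.
grads = []
for i in range(x.shape[0]):
    loss = loss_fn(model(x[i:i + 1]), y[i:i + 1])
    g = torch.autograd.grad(loss, list(model.parameters()))
    grads.append(torch.cat([p.reshape(-1) for p in g]))
G = torch.stack(grads)  # shape (N, P)

# Empirical Fisher and its spectrum. P is small here, so the full P x P
# matrix is fine; eigvalsh returns eigenvalues in ascending order.
fisher = G.T @ G / G.shape[0]
eigvals = torch.linalg.eigvalsh(fisher)
print("largest eigenvalues:", eigvals[-5:])
```

For high-capacity models, where the parameter count P makes the P × P matrix intractable, a standard trick is to work with the N × N Gram matrix (1/N) G Gᵀ instead: it shares the nonzero eigenvalues of the empirical Fisher and costs only O(N²P), which is one plausible route to the efficiency the abstract claims.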
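The abstract also does not give the dynamic sampling schedule or the eigenvalue-based trigger, so the following placeholder loop only demonstrates the mechanics of changing the mini-batch size and learning rate during training. The fixed epoch milestone and the linear learning-rate scaling (a common heuristic, not necessarily the paper's rule) are assumptions.

```python
# Hedged sketch of a dynamic-sampling training loop: grow the mini-batch
# size and rescale the learning rate at fixed milestones. In the paper,
# this decision would come from the Fisher-eigenvalue measurements, not
# from an epoch counter.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
data = TensorDataset(torch.randn(512, 8), torch.randint(0, 3, (512,)))
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 3))
loss_fn = nn.CrossEntropyLoss()

batch_size, lr = 16, 0.01
for epoch in range(6):
    if epoch > 0 and epoch % 2 == 0:  # placeholder trigger
        batch_size *= 2  # take more samples per step...
        lr *= 2          # ...and scale the step size with the batch
    loader = DataLoader(data, batch_size=batch_size, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for xb, yb in loader:
        opt.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: batch_size={batch_size}, lr={lr}, "
          f"last loss={loss.item():.3f}")
```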