Abstract

Adaptive natural gradient learning avoids singularities in the parameter space of multilayer perceptrons. However, it requires many additional parameters beyond those of ordinary backpropagation, in the form of the Fisher information matrix. This paper describes a new approach to natural gradient learning that uses a smaller Fisher information matrix, together with a prior distribution on the network parameters and an annealed learning rate. While this new approach is computationally simpler, its performance is comparable to that of adaptive natural gradient learning.
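
As a concrete illustration of the update the abstract describes, here is a minimal sketch assuming a Gaussian prior on the weights (which contributes a weight-decay term) and a simple $1/t$ annealing schedule; the function name, schedule, and hyperparameter values are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def sngl_step(theta, grad_loss, fisher_inv, t,
              eta0=0.01, tau=100.0, weight_decay=1e-4):
    """One natural-gradient step with a Gaussian prior (weight decay)
    and an annealed learning rate.

    Illustrative sketch only: the 1/t schedule and the hyperparameter
    values are assumptions, not taken from the paper.
    """
    eta_t = eta0 / (1.0 + t / tau)               # annealed learning rate
    grad_map = grad_loss + weight_decay * theta  # loss gradient plus prior term
    return theta - eta_t * (fisher_inv @ grad_map)
```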

Highlights

  • Amari et al. developed the adaptive natural gradient learning (ANGL) algorithm for multilayer perceptrons [1,2,3]

  • The simplified natural gradient learning (SNGL) algorithm introduced in this paper uses a new formulation of the Fisher information matrix

  • For ANGL, the effective learning rate is the base learning rate multiplied by the smallest eigenvalue of the estimated Fisher information matrix (see the formula after this list)
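
Read as a formula, and interpreting "it" as the effective step size along the slowest-converging direction (an interpretation, since the surrounding context is not reproduced here), the last highlight states

$$\eta_{\mathrm{eff}} = \eta \, \lambda_{\min}\bigl(\hat{F}\bigr),$$

where $\eta$ is the base learning rate and $\lambda_{\min}(\hat{F})$ is the smallest eigenvalue of the estimated Fisher information matrix.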


Summary

Introduction

Amari et al. developed the adaptive natural gradient learning (ANGL) algorithm for multilayer perceptrons [1,2,3]. This section describes an exponential family in which each member has a probability density function of the network output errors. This exponential family is used to determine the direction of steepest descent, and hence the natural gradient, employed in the learning algorithm. If the inner product on the tangent space is defined as $\langle v, w \rangle = E_p\{vw\}$, where $E_p$ is the expectation operator with respect to the density function $p$, then the Riemannian metric is the Fisher information matrix, because the Fisher information between two score functions $s_i$ and $s_j$ is $g_{ij} = E_p\{s_i s_j\}$ [9,10,12]. These definitions are illustrated in the following example. ANGL [2] uses the matrix inversion lemma [12] to perform a rank-1 update on the inverse of the Fisher information matrix at each step. The ANGL algorithm performs well, but it needs sufficient memory to store a large matrix and is sensitive to initial conditions and learning rates.
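
The rank-1 inverse update can be made concrete with the Sherman-Morrison form of the matrix inversion lemma. The sketch below maintains the inverse of an exponentially forgotten Fisher estimate $\hat{F}_t = (1-\varepsilon)\hat{F}_{t-1} + \varepsilon\, g g^\top$, where $g$ is the current score (gradient) vector; this forgetting form and the parameter names are illustrative assumptions, not necessarily the exact ANGL recursion.

```python
import numpy as np

def update_inverse_fisher(F_inv, g, eps):
    """Rank-1 update of the inverse Fisher estimate via the
    Sherman-Morrison form of the matrix inversion lemma.

    Maintains the inverse of
        F_t = (1 - eps) * F_{t-1} + eps * g g^T
    in O(n^2) per step instead of the O(n^3) cost of re-inverting.
    The exponential-forgetting form is an illustrative assumption.
    """
    u = F_inv @ g                    # F_{t-1}^{-1} g
    c = float(g @ u)                 # g^T F_{t-1}^{-1} g
    # Sherman-Morrison applied to the convex combination above
    return (F_inv - (eps / ((1.0 - eps) + eps * c)) * np.outer(u, u)) / (1.0 - eps)
```

Each call costs $O(n^2)$ in the number of parameters $n$, which avoids refactorizing the matrix at every step but still requires storing the full $n \times n$ inverse; this is the memory burden noted above.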

A Simplified Natural Gradient Learning Algorithm
Experimental Results
Conclusion