Parametrisation Independence of the Natural Gradient in Overparametrised Systems

Jesse Van Oostrum, Nihat Ay

doi:10.1007/978-3-030-80209-7_78

Abstract

In this paper we study the natural gradient method for overparametrised systems. This method is based on the natural gradient field, which is invariant with respect to coordinate transformations. One calculates the natural gradient of a function on the manifold by multiplying the ordinary gradient of the function by the inverse of the Fisher Information Matrix (FIM). In overparametrised models, the FIM is degenerate and therefore one needs to use a generalised inverse. We show explicitly that using a generalised inverse, and in particular the Moore-Penrose inverse, does not affect the parametrisation independence of the natural gradient. Furthermore, we show that for singular points on the manifold the parametrisation independence is not even guaranteed for non-overparametrised models.

Keywords: Natural gradient, Riemannian metric, Deep learning, Information geometry
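As an illustration of the computation described in the abstract, the following is a minimal sketch on a toy model that is not taken from the paper. It assumes a Bernoulli distribution whose logit is xi = theta_1 + theta_2, so two parameters describe a one-dimensional manifold and the FIM in the theta coordinates is singular. The loss (cross-entropy against a hypothetical `target` probability), the helper `natural_gradient`, and all variable names are assumptions made for this example. The sketch computes the natural gradient in the overparametrised coordinates with the Moore-Penrose inverse, pushes it forward to the xi coordinate, and compares it with the natural gradient computed directly in xi.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def natural_gradient(fim, grad):
    # Multiply the ordinary gradient by the Moore-Penrose inverse of the FIM.
    # For an invertible FIM this coincides with the ordinary inverse.
    return np.linalg.pinv(fim) @ grad


# Hypothetical target probability used only to define a loss for the example.
target = 0.8

# Overparametrised coordinates: xi = theta_1 + theta_2.
theta = np.array([0.3, -0.1])
J = np.array([[1.0, 1.0]])  # Jacobian d(xi)/d(theta), shape (1, 2)
xi = theta.sum()
p = sigmoid(xi)

# Fisher information of a Bernoulli distribution in its logit coordinate.
fim_xi = np.array([[p * (1.0 - p)]])
# FIM in theta is obtained from the Jacobian; it has rank 1, so it is singular.
fim_theta = J.T @ fim_xi @ J

# Gradient of the cross-entropy loss with respect to the logit is p - target.
grad_xi = np.array([p - target])
grad_theta = J.T @ grad_xi  # chain rule

nat_xi = natural_gradient(fim_xi, grad_xi)
nat_theta = natural_gradient(fim_theta, grad_theta)

# Push the theta-space natural gradient forward to the xi coordinate.
pushed = J @ nat_theta

print("rank of FIM in theta:", np.linalg.matrix_rank(fim_theta))
print("natural gradient in xi:", nat_xi)
print("pushforward from theta:", pushed)
```

In this toy case the two printed vectors agree up to floating-point error, which is consistent with the parametrisation independence claimed in the abstract; it is a single numerical check, not a proof.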

Full Text