Abstract
A hierarchical neural network usually has many singular regions in the parameter space due to the degeneration of hidden units. Here, we focus on a three-layer perceptron, which has one-dimensional singular regions comprising both attractive and repulsive parts. Such a singular region is often called a Milnor-like attractor. It is empirically known that in the vicinity of a Milnor-like attractor, several parameters converge much faster than the rest and that the dynamics can be reduced to smaller-dimensional ones. Here we give a rigorous proof for this phenomenon based on a center manifold theory. As an application, we analyze the reduced dynamics near the Milnor-like attractor and study the stochastic effects of the online learning.
Published Version (Free)
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have