Abstract

Hierarchical learning machines used in information science have singularities in their parameter spaces. At a singularity, the Fisher information matrix becomes degenerate, so the learning theory of regular statistical models cannot be applied. It was recently proven that, if the true parameter is contained in the singularities, then the generalization error in Bayes estimation is far smaller than that of regular statistical models. In this paper, under the condition that the true parameter is not contained in the singularities, we show two results: (1) if the dimension of the parameter from inputs to hidden units is not larger than three, then there exists a region of true parameters for which the generalization error is larger than that of regular models; and (2) if the dimension is larger than three, then, for an arbitrary true parameter, the generalization error is smaller than that of regular models.
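The degeneracy of the Fisher information matrix at a singularity can be seen numerically in a toy model. The sketch below (an illustration, not from the paper) uses the one-hidden-unit model f(x; a, b) = a·tanh(b·x) with Gaussian noise, for which the Fisher information is I(θ) = E[∇f ∇fᵀ]. At the singular parameter (a, b) = (0, 0) the gradient of f vanishes for every input, so the estimated matrix has rank 0; at a generic parameter it has full rank 2.

```python
import numpy as np

def fisher(a, b, xs):
    """Monte-Carlo estimate of the 2x2 Fisher information matrix
    for the model f(x; a, b) = a * tanh(b * x) with unit Gaussian noise."""
    g_a = np.tanh(b * xs)                 # df/da
    g_b = a * xs / np.cosh(b * xs) ** 2   # df/db
    G = np.stack([g_a, g_b], axis=1)      # per-sample gradients, shape (n, 2)
    return G.T @ G / len(xs)

rng = np.random.default_rng(0)
xs = rng.normal(size=10_000)

I_singular = fisher(0.0, 0.0, xs)  # true parameter at the singularity
I_regular = fisher(1.0, 1.0, xs)   # generic (regular) true parameter

print(np.linalg.matrix_rank(I_singular))  # 0: completely degenerate
print(np.linalg.matrix_rank(I_regular))   # 2: full rank
```

Because the Fisher matrix is degenerate at such points, the usual asymptotics of regular models (e.g. asymptotic normality of the MLE, AIC/BIC penalties counting parameters) break down, which is why the generalization error behaves differently there.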

