Abstract

We introduce a fast algorithm for entrywise evaluation of the Gauss--Newton Hessian (GNH) matrix of a fully connected feed-forward neural network. The algorithm has a precomputation step and a sampling step. While it generally requires $\mathcal{O}(Nn)$ work to compute an entry (and the entire column containing it) of the GNH matrix for a neural network with $N$ parameters and $n$ data points, our fast sampling algorithm reduces the cost to $\mathcal{O}(n + d/\epsilon^2)$ work, where $d$ is the output dimension of the network and $\epsilon$ is a prescribed accuracy (the cost is independent of $N$). One application of our algorithm is constructing the hierarchical-matrix ($\mathcal{H}$-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. Storing and factorizing the GNH matrix directly requires $\mathcal{O}(N^2)$ memory and $\mathcal{O}(N^3)$ work, respectively; the $\mathcal{H}$-matrix approximation requires only $\mathcal{O}(N r_o)$ memory and $\mathcal{O}(N r_o^2)$ work to factorize, where $r_o \ll N$ is the maximum rank of the off-diagonal blocks of the GNH matrix. We demonstrate the performance of our fast algorithm and the $\mathcal{H}$-matrix approximation on classification and autoencoder neural networks.
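To make the sampling idea concrete, below is a minimal NumPy sketch (our own illustration, not the paper's algorithm). It assumes the per-sample Jacobian tensor `J` of shape `(n, d, N)` is available, so that under a least-squares loss a GNH entry is a double sum over data points and output channels; sampling index pairs uniformly then gives an unbiased estimate whose error shrinks with the number of samples $s$, i.e. $s = \mathcal{O}(1/\epsilon^2)$ for accuracy $\epsilon$. The function names, shapes, and uniform sampling scheme are assumptions for illustration; the paper's $\mathcal{O}(n + d/\epsilon^2)$ method additionally relies on a precomputation step.

```python
import numpy as np

# Hypothetical setup (not from the paper): J has shape (n, d, N), holding the
# Jacobian of the d network outputs w.r.t. the N parameters for each of the
# n data points. For a least-squares loss, the GNH entry is
#     G[j, k] = (1/n) * sum_i sum_c J[i, c, j] * J[i, c, k].
rng = np.random.default_rng(0)
n, d, N = 200, 10, 5000
J = rng.standard_normal((n, d, N)) / np.sqrt(N)

def gnh_entry_exact(J, j, k):
    """Exact GNH entry: O(n*d) given J, but O(N*n) if J must first be formed."""
    n = J.shape[0]
    return np.einsum('ic,ic->', J[:, :, j], J[:, :, k]) / n

def gnh_entry_sampled(J, j, k, num_samples):
    """Monte Carlo estimate of G[j, k]: sample (i, c) index pairs uniformly.

    This is only an illustrative unbiased estimator needing O(1/eps^2)
    samples for accuracy eps; the paper's algorithm combines precomputation
    with a tailored sampling step to reach O(n + d/eps^2) work per entry.
    """
    n, d, _ = J.shape
    i = rng.integers(0, n, size=num_samples)
    c = rng.integers(0, d, size=num_samples)
    # Each product J[i,c,j]*J[i,c,k] is an unbiased sample of the double sum
    # divided by n*d, so rescaling by d recovers the (1/n)-normalized entry.
    return d * np.mean(J[i, c, j] * J[i, c, k])

j, k = 3, 4321
print(gnh_entry_exact(J, j, k))
print(gnh_entry_sampled(J, j, k, num_samples=20000))
```

Note that the sampled estimator never touches the full $N$-dimensional parameter axis beyond the two columns $j$ and $k$, which is the source of the $N$-independence of the per-entry cost.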
