The Hessian by blocks for neural network by backward propagation

Radhia Bessi, Nabil Gmati

doi:10.1080/16583655.2024.2327102

Open Access

Journal: Journal of Taibah University for Science
Publication Date: Apr 23, 2024
License type: CC BY-NC 4.0

Abstract

The back-propagation algorithm, used with a stochastic gradient method, together with the increase in computing performance, is at the origin of the recent deep learning trend. For some problems, however, the convergence of gradient methods is still very slow. Newton's method offers potential advantages in terms of faster convergence. It uses the Hessian matrix to guide the optimization process, but this increases the computational cost of each iteration. Indeed, although the expression of the Hessian matrix is explicitly known, previous work did not propose an efficient algorithm for its fast computation. In this work, we first propose a backward algorithm to compute the exact Hessian matrix. In addition, we introduce original operators for the calculation of second derivatives, which make the algorithm easier to read and allow it to be parallelized. To study the practical performance of Newton's method, we apply the proposed algorithm to train two classical neural networks for regression and classification problems and report the associated numerical results.
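
As a rough illustration of the convergence claim only, the sketch below compares plain gradient descent with a single Newton step on a toy ill-conditioned quadratic loss. The loss, the matrix `A`, the step size, and all function names are assumptions chosen for illustration; none of them come from the paper, and the sketch does not reproduce the paper's backward algorithm for the Hessian of a neural network.

```python
# Toy comparison of gradient descent and Newton's method.
# Assumption: loss(w) = 0.5 * w^T A w with a fixed, ill-conditioned matrix A.
# This is NOT the paper's network or algorithm, only a minimal illustration.

A = [[1.0, 0.0],
     [0.0, 100.0]]  # Hessian of the toy loss (condition number 100)


def loss(w):
    """Quadratic loss 0.5 * w^T A w."""
    return 0.5 * (A[0][0] * w[0] ** 2
                  + (A[0][1] + A[1][0]) * w[0] * w[1]
                  + A[1][1] * w[1] ** 2)


def grad(w):
    """Gradient A w (A is symmetric)."""
    return [A[0][0] * w[0] + A[0][1] * w[1],
            A[1][0] * w[0] + A[1][1] * w[1]]


def hessian(w):
    """Hessian of a quadratic is constant; a network's would depend on w."""
    return A


def solve_2x2(H, g):
    """Solve H d = g for a 2x2 system with Cramer's rule."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return [(H[1][1] * g[0] - H[0][1] * g[1]) / det,
            (H[0][0] * g[1] - H[1][0] * g[0]) / det]


def gradient_descent(w, lr, steps):
    """Repeat w <- w - lr * grad(w)."""
    for _ in range(steps):
        g = grad(w)
        w = [w[0] - lr * g[0], w[1] - lr * g[1]]
    return w


def newton(w, steps):
    """Repeat w <- w - H^{-1} grad(w); costs one linear solve per step."""
    for _ in range(steps):
        d = solve_2x2(hessian(w), grad(w))
        w = [w[0] - d[0], w[1] - d[1]]
    return w


w0 = [1.0, 1.0]
w_gd = gradient_descent(w0, lr=0.01, steps=100)
w_newton = newton(w0, steps=1)

print("loss after 100 gradient steps:", loss(w_gd))
print("loss after 1 Newton step:     ", loss(w_newton))
```

On a quadratic, one Newton step reaches the minimizer, while gradient descent is held back by the poorly scaled direction. The price is forming the Hessian and solving a linear system at every iteration, which is the per-iteration cost the abstract refers to.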

Full Text