Abstract

We consider a scalar function depending on a numerical solution of an initial value problem, and its second-derivative (Hessian) matrix with respect to the initial value. The need to extract information from the Hessian, or to solve a linear system having the Hessian as the coefficient matrix, arises in many research fields such as optimization, Bayesian estimation, and uncertainty quantification. For memory efficiency, these tasks often employ a Krylov subspace method, which does not need to hold the Hessian matrix explicitly and requires only the multiplication of the Hessian with a given vector. One way to obtain an approximation of such a Hessian-vector multiplication is to integrate the so-called second-order adjoint system numerically. However, the error in the approximation can be significant even if the numerical integration of the second-order adjoint system is sufficiently accurate. This paper presents a novel algorithm that computes the intended Hessian-vector multiplication exactly and efficiently. To this end, we give a new concise derivation of the second-order adjoint system and show that the intended multiplication can be computed exactly by applying a particular numerical method to the second-order adjoint system. In the discussion, symplectic partitioned Runge–Kutta methods play an essential role.
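To make the matrix-free Krylov setting concrete, here is a minimal sketch (not from the paper): a conjugate-gradient solve that touches the Hessian only through a matvec callback. The routine `hess_vec`, the dimension `d`, and the stand-in diagonal matrix are illustrative placeholders; in the paper's setting the matvec would instead come from the exact Hessian-vector multiplication developed below.

```python
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg

d = 4  # illustrative dimension

def hess_vec(gamma):
    # Stand-in for (H_theta C(x_N(theta))) gamma; a fixed SPD matrix keeps the
    # sketch runnable. The paper replaces this with an exact computation via
    # the second-order adjoint system.
    A = jnp.diag(jnp.arange(1.0, d + 1.0))
    return A @ gamma

# CG needs only the matvec, so the Hessian is never formed explicitly.
b = jnp.ones(d)
v, _ = cg(hess_vec, b)
print(v)  # solves H v = b in a matrix-free fashion
```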

Highlights

  • We consider an initial value problem for a d-dimensional time-dependent vector x driven by an ordinary differential equation (ODE) of the form $\frac{\mathrm{d}}{\mathrm{d}t} x(t;\theta) = f(x(t;\theta)),\ x(0;\theta) = \theta$ (1.1), where t is time, the function $f:\mathbb{R}^d \to \mathbb{R}^d$ is assumed to be sufficiently differentiable, and θ is the initial value of x. (A minimal integration sketch for such an IVP follows this list.)

  • We have shown a concise derivation of the second-order adjoint system, and a procedure for computing a matrix-vector multiplication exactly, where the matrix is the Hessian, with respect to the initial value, of a function of the numerical solution of an initial value problem.

  • The fact that the second-order adjoint system can be reformulated as part of a larger adjoint system is the key to obtaining the exact Hessian-vector multiplication based on the Sanz-Serna scheme.
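For concreteness, here is a minimal sketch of an IVP of the form (1.1) and its discrete solution map under the classical fourth-order Runge–Kutta method. The right-hand side `f`, the step size, and the step count are hypothetical choices for illustration, not taken from the paper.

```python
import jax.numpy as jnp

def f(x):
    # Hypothetical right-hand side standing in for the generic f: R^d -> R^d.
    return -x ** 3

def rk4_step(x, h):
    # One step of the classical fourth-order Runge-Kutta method.
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def x_N(theta, h=0.05, n_steps=100):
    # Discrete solution map theta -> x_N(theta) for (1.1).
    x = theta
    for _ in range(n_steps):
        x = rk4_step(x, h)
    return x

print(x_N(jnp.array([1.0, -0.5, 0.25])))
```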



Introduction

As is the case with the gradient, the simplest way of obtaining all elements of the Hessian is to integrate the system (1.1) multiple times with perturbed initial values. This approach is noticeably expensive and, moreover, may suffer from discretization error. Focusing on Runge–Kutta methods and their numerical solutions, we propose an algorithm that computes the Hessian-vector multiplication $(\mathrm{H}_\theta C(x_N(\theta)))\gamma$ exactly. To this end, we give a new concise derivation of the second-order adjoint system, which makes it possible to discuss the second-order adjoint system within the framework of the conventional (first-order) adjoint system and to apply the technique of [15] to the second-order adjoint system.
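As a hedged illustration of the target quantity: forward-over-reverse automatic differentiation also differentiates the discrete Runge–Kutta solution map exactly, so it yields the same $(\mathrm{H}_\theta C(x_N(\theta)))\gamma$ that the paper computes by integrating the second-order adjoint system with a symplectic partitioned Runge–Kutta method. The sketch below uses this AD route purely as a cross-check; the right-hand side, the cost function, and the explicit-Euler discretization are illustrative assumptions, not the paper's algorithm.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Hypothetical right-hand side of (1.1).
    return -x ** 3

def x_N(theta, h=0.05, n_steps=50):
    # Explicit Euler (a one-stage Runge-Kutta method): theta -> x_N(theta).
    x = theta
    for _ in range(n_steps):
        x = x + h * f(x)
    return x

def cost(theta):
    # Hypothetical scalar function C of the numerical terminal state.
    return 0.5 * jnp.sum(x_N(theta) ** 2)

def hvp(theta, gamma):
    # Forward-over-reverse AD: exact (H_theta C(x_N(theta))) gamma for the
    # chosen discretization, with no additional approximation error.
    return jax.jvp(jax.grad(cost), (theta,), (gamma,))[1]

theta = jnp.array([1.0, -0.5, 0.25])
gamma = jnp.array([0.0, 1.0, 0.0])
print(hvp(theta, gamma))
```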

Preliminaries
Adjoint method
Exact gradient calculation
Hessian-vector multiplication
Concise derivation of the second-order adjoint system
Exact Hessian-vector multiplication
Actual computation procedure
Numerical verification
Simple pendulum
Allen–Cahn equation
Wave equation
Conclusion