Abstract

We consider a scalar function depending on a numerical solution of an initial value problem, and its second-derivative (Hessian) matrix with respect to the initial value. The need to extract information from the Hessian, or to solve a linear system having the Hessian as the coefficient matrix, arises in many research fields such as optimization, Bayesian estimation, and uncertainty quantification. For memory efficiency, these tasks often employ a Krylov subspace method, which does not need to hold the Hessian matrix explicitly and requires only the multiplication of the Hessian with a given vector. One way to obtain an approximation of such a Hessian-vector multiplication is to integrate the so-called second-order adjoint system numerically. However, the error in the approximation can be significant even if the numerical integration of the second-order adjoint system is sufficiently accurate. This paper presents a novel algorithm that computes the intended Hessian-vector multiplication exactly and efficiently. To this end, we give a new concise derivation of the second-order adjoint system and show that the intended multiplication can be computed exactly by applying a particular numerical method to the second-order adjoint system. In the discussion, symplectic partitioned Runge–Kutta methods play an essential role.
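To make the matrix-free Krylov setting concrete, here is a minimal sketch (not from the paper): a conjugate-gradient solve that touches the Hessian only through a matvec callback. The routine `hess_vec`, the dimension `d`, and the stand-in diagonal matrix are illustrative placeholders; in the paper's setting the matvec would instead come from the exact Hessian-vector multiplication developed below.

```python
import jax.numpy as jnp
from jax.scipy.sparse.linalg import cg

d = 4  # illustrative dimension

def hess_vec(gamma):
    # Stand-in for (H_theta C(x_N(theta))) gamma; a fixed SPD matrix keeps the
    # sketch runnable. The paper replaces this with an exact computation via
    # the second-order adjoint system.
    A = jnp.diag(jnp.arange(1.0, d + 1.0))
    return A @ gamma

# CG needs only the matvec, so the Hessian is never formed explicitly.
b = jnp.ones(d)
v, _ = cg(hess_vec, b)
print(v)  # solves H v = b in a matrix-free fashion
```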

Highlights

  • We consider an initial value problem for a d-dimensional time-dependent vector x driven by an ordinary differential equation (ODE) of the form $\frac{\mathrm{d}}{\mathrm{d}t} x(t;\theta) = f(x(t;\theta)),\ x(0;\theta) = \theta$ (1.1), where t is time, the function $f:\mathbb{R}^d \to \mathbb{R}^d$ is assumed to be sufficiently differentiable, and θ is the initial value of x. (A minimal integration sketch for such an IVP follows this list.)

  • We have shown a concise derivation of the second-order adjoint system, and a procedure for computing a matrix-vector multiplication exactly, where the matrix is the Hessian, with respect to the initial value, of a function of the numerical solution of an initial value problem.

  • The fact that the second-order adjoint system can be reformulated as part of a larger adjoint system is the key to obtaining the exact Hessian-vector multiplication based on the Sanz-Serna scheme.
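For concreteness, here is a minimal sketch of an IVP of the form (1.1) and its discrete solution map under the classical fourth-order Runge–Kutta method. The right-hand side `f`, the step size, and the step count are hypothetical choices for illustration, not taken from the paper.

```python
import jax.numpy as jnp

def f(x):
    # Hypothetical right-hand side standing in for the generic f: R^d -> R^d.
    return -x ** 3

def rk4_step(x, h):
    # One step of the classical fourth-order Runge-Kutta method.
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def x_N(theta, h=0.05, n_steps=100):
    # Discrete solution map theta -> x_N(theta) for (1.1).
    x = theta
    for _ in range(n_steps):
        x = rk4_step(x, h)
    return x

print(x_N(jnp.array([1.0, -0.5, 0.25])))
```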



Introduction

As is the case with the gradient, the simplest way of obtaining all elements of the Hessian is to integrate the system (1.1) multiple times with perturbed initial values. This approach is noticeably expensive and, moreover, may suffer from discretization error. Focusing on Runge–Kutta methods and their numerical solutions, we propose an algorithm that computes the Hessian-vector multiplication $(\mathrm{H}_\theta C(x_N(\theta)))\gamma$ exactly. To this end, we give a new concise derivation of the second-order adjoint system, which makes it possible to discuss the second-order adjoint system within the framework of the conventional (first-order) adjoint system and to apply the technique of [15] to the second-order adjoint system.
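As a hedged illustration of the target quantity: forward-over-reverse automatic differentiation also differentiates the discrete Runge–Kutta solution map exactly, so it yields the same $(\mathrm{H}_\theta C(x_N(\theta)))\gamma$ that the paper computes by integrating the second-order adjoint system with a symplectic partitioned Runge–Kutta method. The sketch below uses this AD route purely as a cross-check; the right-hand side, the cost function, and the explicit-Euler discretization are illustrative assumptions, not the paper's algorithm.

```python
import jax
import jax.numpy as jnp

def f(x):
    # Hypothetical right-hand side of (1.1).
    return -x ** 3

def x_N(theta, h=0.05, n_steps=50):
    # Explicit Euler (a one-stage Runge-Kutta method): theta -> x_N(theta).
    x = theta
    for _ in range(n_steps):
        x = x + h * f(x)
    return x

def cost(theta):
    # Hypothetical scalar function C of the numerical terminal state.
    return 0.5 * jnp.sum(x_N(theta) ** 2)

def hvp(theta, gamma):
    # Forward-over-reverse AD: exact (H_theta C(x_N(theta))) gamma for the
    # chosen discretization, with no additional approximation error.
    return jax.jvp(jax.grad(cost), (theta,), (gamma,))[1]

theta = jnp.array([1.0, -0.5, 0.25])
gamma = jnp.array([0.0, 1.0, 0.0])
print(hvp(theta, gamma))
```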

Preliminaries
Adjoint method
Exact gradient calculation
Hessian-vector multiplication
Concise derivation of the second-order adjoint system
Exact Hessian-vector multiplication
Actual computation procedure
Numerical verification
Simple pendulum
Allen–Cahn equation
Wave equation
Conclusion