The inverse problem of supervised reconstruction of depth-variable (time-dependent) parameters in ordinary differential equations is considered, with the typical application of finding the weights of a neural ordinary differential equation (NODE) for a residual network with time-continuous layers. The differential equation is treated as an abstract and isolated entity, termed a standalone NODE (sNODE), to accommodate a wide range of applications. The proposed parameter reconstruction is performed by minimizing a cost functional that covers a variety of loss functions and penalty terms. Regularization via penalty terms is incorporated to support ethical and trustworthy AI formulations. A nonlinear conjugate gradient (NCG) mini-batch optimization scheme is derived for the training; it has the benefit of including a sensitivity problem. The model-based (differential-equation) approach is thus combined with a data-driven learning procedure. Mathematical properties are stated for the differential equation and the cost functional. The required adjoint problem is derived together with the sensitivity problem. The sensitivity problem itself can be used to estimate changes in the output under perturbations of the trained parameters. To preserve smoothness during the iterations, the Sobolev gradient is calculated and incorporated. Numerical results validating the procedure for a NODE on synthetic datasets are included, with comparisons against standard gradient approaches. To assess stability, a strategy for constructing adversarial attacks is built on the sensitivity problem, and it is shown that the given method with Sobolev gradients is more robust against such attacks than standard approaches to parameter identification.
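For concreteness, the setting described above can be sketched as follows; the symbols $f$, $u$, $\theta$, $\mathcal{L}$, $\mathcal{R}$, and $\alpha$ are generic placeholders chosen here for illustration and need not match the paper's notation. The sNODE propagates an input $x$ through time-dependent parameters $\theta(t)$, the cost functional combines a terminal loss with a penalty, and an adjoint problem, solved backwards in time, supplies the gradient used by the NCG scheme:

\[
\begin{aligned}
&\dot{u}(t) = f\bigl(u(t),\theta(t)\bigr), \qquad u(0) = x, \quad t \in [0,T], % sNODE forward problem\\
&E(\theta) = \mathcal{L}\bigl(u(T),y\bigr) + \alpha\,\mathcal{R}(\theta), % cost functional: loss plus penalty\\
&\dot{\lambda}(t) = -\,\partial_u f\bigl(u(t),\theta(t)\bigr)^{\top}\lambda(t), \qquad \lambda(T) = \partial_u \mathcal{L}\bigl(u(T),y\bigr), % adjoint problem\\
&\nabla_{\theta(t)} E = \partial_\theta f\bigl(u(t),\theta(t)\bigr)^{\top}\lambda(t) + \alpha\,\nabla_{\theta(t)}\mathcal{R}(\theta). % gradient of the cost functional
\end{aligned}
\]

In a Sobolev-gradient variant, the $L^2$ gradient above would be smoothed before use as a descent direction, for instance (as one common construction, not necessarily the paper's) by solving $(I - \partial_t^2)\,g = \nabla_{\theta(t)} E$ with suitable boundary conditions and descending along $g$.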