Abstract
We characterize and remedy a failure mode that may arise from multi-scale dynamics with scale imbalances during training of deep neural networks, such as physics informed neural networks (PINNs). PINNs are popular machine-learning templates that allow for seamless integration of physical equation models with data. Their training amounts to solving an optimization problem over a weighted sum of data-fidelity and equation-fidelity objectives. Conflicts between objectives can arise from scale imbalances, heteroscedasticity in the data, stiffness of the physical equation, or from catastrophic interference during sequential training. We explain the training pathology arising from these conflicts and propose a simple yet effective inverse Dirichlet weighting strategy to alleviate the issue. We compare with Sobolev training of neural networks, providing the baseline of analytically ε-optimal training. We demonstrate the effectiveness of inverse Dirichlet weighting in various applications, including a multi-scale model of active turbulence, where we show orders of magnitude improvement in accuracy and convergence over conventional PINN training. For inverse modeling using sequential training, we find that inverse Dirichlet weighting protects a PINN against catastrophic forgetting.
Highlights
Data-driven modeling has emerged as a powerful and complementary approach to first-principles modeling
We characterize and remedy a failure mode that may arise from multi-scale dynamics with scale imbalances during training of deep neural networks, such as Physics Informed Neural Networks (PINNs)
It is straightforward to show that the training of a PINN amounts to a Multi-Task Learning (MTL) problem, which is sensitive to the choice of regularization weights
Summary
Data-driven modeling has emerged as a powerful and complementary approach to first-principles modeling. The popular Physics Informed Neural Networks (PINNs) rely on knowing a differential equation model of the system in order to solve a soft-constrained optimization problem [5, 8]. We characterize training pathologies of PINNs and provide criteria for their occurrence. We explain these pathologies by showing a connection between PINNs and Sobolev training, and we propose a strategy for loss weighting in PINNs based on the Dirichlet energy of the task-specific gradients. We show that this strategy reduces optimization bias and protects against catastrophic forgetting. We evaluate the proposed inverse Dirichlet weighting by comparing with Sobolev training in a case where provably optimal weights can be derived, and by empirical comparison with two conceptually different state-of-the-art PINN weighting approaches.
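To make the weighting idea concrete, the following is a minimal sketch of gradient-based loss balancing in the spirit of inverse Dirichlet weighting: each task weight is set from the spread of that task's parameter gradient, so that tasks with small-magnitude gradients are up-weighted. The function name, the use of flattened NumPy gradient arrays, and the toy two-task setup are illustrative assumptions, not the paper's exact implementation (which operates on autodiff gradients during training, typically smoothed with a moving average).

```python
import numpy as np

def inverse_dirichlet_weights(task_grads, eps=1e-12):
    """Sketch of gradient-variance-based loss balancing.

    task_grads: list of 1-D arrays, one flattened parameter gradient
    per training objective (e.g. data fidelity, equation fidelity).
    Returns one positive weight per task: the task with the largest
    gradient standard deviation gets weight ~1, and tasks with smaller
    gradients are scaled up so the weighted gradients are balanced.
    """
    stds = np.array([g.std() for g in task_grads])
    return stds.max() / (stds + eps)

# Toy usage: the equation-fidelity gradient is ~100x larger than the
# data-fidelity gradient, so the data term is up-weighted by ~100.
rng = np.random.default_rng(0)
g_data = rng.normal(0.0, 0.01, size=1000)  # small-scale gradient
g_pde = rng.normal(0.0, 1.00, size=1000)   # large-scale gradient
weights = inverse_dirichlet_weights([g_data, g_pde])
```

In a PINN training loop, such weights would multiply the individual loss terms before summation, counteracting the scale imbalance between objectives that the paper identifies as the source of the training pathology.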