This paper develops multifidelity modeling approaches using neural network surrogates, where training data arising from multiple model forms and resolutions are integrated to predict high-fidelity response quantities of interest at lower cost. The application context is quantum chemistry and the integration of information from multiple levels of theory. Important foundations include the use of symmetry-function-based atomic energy vector constructions as feature vectors for representing structures across families of molecules, and single-fidelity neural network training capabilities that learn the relationships needed to map feature vectors to potential energy predictions. These foundations are embedded within several multifidelity topologies that decompose the high-fidelity mapping into model-based components, including sequential formulations that admit a general nonlinear mapping across fidelities and discrepancy-based formulations that presume an additive decomposition. Methodologies are first explored and demonstrated on a pair of simple analytical test problems and then deployed for potential energy prediction for C5H5, using B2PLYP-D3/6-311++G(d,p) for high-fidelity simulation data and Hartree-Fock/6-31G for low-fidelity data. For the common case of limited access to high-fidelity data, our computational results demonstrate that multifidelity neural network potential energy surface constructions achieve roughly an order of magnitude improvement, either in terms of test error reduction for equivalent total simulation cost or reduction in total cost for equivalent error.
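To make the discrepancy-based idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes an additive decomposition of the form high(x) ≈ low(x) + delta(x), and it swaps in simple polynomial least-squares fits for the neural network surrogates so that it runs with NumPy alone. The functions `f_low` and `f_high`, the sample locations, and the polynomial degrees are all hypothetical choices made for illustration; they are not the analytical test problems used in the paper.

```python
import numpy as np
from numpy.polynomial import Chebyshev

# Hypothetical 1-D test functions (assumptions, not the paper's test problems).
def f_low(x):
    """Cheap low-fidelity model."""
    return np.sin(8.0 * x)

def f_high(x):
    """Expensive high-fidelity model: low fidelity plus a smooth discrepancy."""
    return np.sin(8.0 * x) + 0.5 * x**2

DOMAIN = [0.0, 1.0]

# Many cheap low-fidelity samples, only a handful of high-fidelity samples.
x_low = np.linspace(0.0, 1.0, 40)
x_high = np.array([0.05, 0.35, 0.65, 0.95])

# Step 1: fit a surrogate to the plentiful low-fidelity data.
# (Polynomial stand-in for a neural network surrogate.)
low_model = Chebyshev.fit(x_low, f_low(x_low), deg=9, domain=DOMAIN)

# Step 2: fit a low-order surrogate to the discrepancy between the
# high-fidelity data and the low-fidelity surrogate at the same points.
residual = f_high(x_high) - low_model(x_high)
delta_model = Chebyshev.fit(x_high, residual, deg=2, domain=DOMAIN)

def predict_multifidelity(x):
    """Additive (discrepancy-based) prediction: low surrogate + correction."""
    return low_model(x) + delta_model(x)

# Baseline: a single-fidelity surrogate trained on the same few
# high-fidelity points, with no low-fidelity information.
single_model = Chebyshev.fit(x_high, f_high(x_high), deg=3, domain=DOMAIN)

def rmse(pred, truth):
    """Root-mean-square error between predictions and reference values."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

x_test = np.linspace(0.0, 1.0, 201)
mf_rmse = rmse(predict_multifidelity(x_test), f_high(x_test))
sf_rmse = rmse(single_model(x_test), f_high(x_test))

print(f"multifidelity RMSE:   {mf_rmse:.3e}")
print(f"single-fidelity RMSE: {sf_rmse:.3e}")
```

In this toy setting the multifidelity fit benefits because the discrepancy is much smoother than the high-fidelity function itself, so a few high-fidelity samples suffice to learn the correction. A sequential formulation would instead learn a general mapping from the input and the low-fidelity prediction to the high-fidelity output, which does not presume that the correction is additive.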