In this work, we utilize our recently developed machine learning (ML)-corrected abinitio dispersion (aiD) potential, known as D3-ML, which is based on the comprehensive SAPT10K dataset and relies solely on Cartesian coordinates as input, to address the dispersion deficiencies in second-order Møller-Plesset perturbation theory (MP2) by replacing its problematic dispersion and exchange-dispersion terms with D3-ML. This leads to the development of a new dispersion-corrected MP2 method, MP2+aiD(CCD), which outperforms other spin-component-scaled and dispersion-corrected MP2 methods as well as popular ML models for predicting noncovalent interactions across various datasets, including S66 × 8, NAP6 (containing 6 naphthalene dimers), L7, S12L, DNA-ellipticine, the C60 dimer, and C60[6]CPPA. In addition, MP2+aiD(CCD) exhibits comparable or even superior performance compared to the contemporary ωB97M-V functional. The limited performance of pure ML models for systems outside the training set or larger than those in the training set highlights their instability and unpredictability. Conversely, the outstanding performance and transferability of the hybrid MP2+aiD(CCD) method can be attributed to the fusion of the physical electronic structure method and a data-driven ML model, combining the strengths of both sides. This investigation firmly establishes MP2+aiD(CCD) as one of the most accurate and reliable fifth-order scaling correlated wave function methods currently available for modeling noncovalent interactions, even for large complexes. MP2+aiD(CCD) is expected to be reliably applicable in investigating real-life complexes at the hundred-atom scale.
Read full abstract