In this work, we report benchmark binding energies for the seven complexes of the L7 data set, six host-guest complexes from the S12L data set, a C60 dimer, the DNA-ellipticine intercalation complex, and the largest system of the study, the HIV-indinavir system, which contains 343 atoms, 139 of them heavy atoms. The high-quality values were obtained via a focal point method that relies on canonical second-order Møller-Plesset theory and the domain-based local pair natural orbital scheme for coupled cluster with single, double, and perturbative triple excitations [DLPNO-CCSD(T)], extrapolated to the complete basis set (CBS) limit. The results not only corroborate but also improve upon some previous benchmark values for large noncovalent complexes, albeit at a relatively steep cost. Although local CCSD(T) and the largely successful fixed-node diffusion Monte Carlo (FN-DMC) method have been shown to generally agree for small- to medium-size systems, their reported binding energies diverge for large complexes, and the magnitude of the disagreement is a definite cause for concern. For example, the largest deviation in the L7 data set was 2.8 kcal/mol (∼10%) on the low end in C3GC. The deviation grows worse in the S12L set, which showed a difference of up to 10.4 kcal/mol (∼25%) by a conservative estimate in buckycatcher-C60. The DNA-ellipticine complex also showed a disagreement of 4.4 kcal/mol (∼10%) between the two state-of-the-art methods. This disagreement between local CCSD(T) and FN-DMC for large noncovalent complexes shows that canonical CCSD(T), Monte Carlo CCSD(T), or full configuration interaction quantum Monte Carlo approaches urgently need to be made applicable to systems on the hundred-atom scale to resolve this dilemma. In addition, the performance of cheaper popular computational methods was assessed for the studied complexes with respect to DLPNO-CCSD(T)/CBS. 
r2SCAN-3c, B97M-V, and PBE0+D4 perform well for the large noncovalent complexes studied here, and GFN2-xTB performs well for π-π stacking complexes. B97M-V is the most reliable computationally efficient approach to predicting noncovalent interactions in large complexes, being the only one with binding energy errors within the so-called 1 kcal/mol "chemical accuracy". The benchmark interaction energies of these host-guest complexes, molecular materials, and biological systems with electronic and medicinal implications provide crucial reference data for the improvement of current and future lower-cost methods.
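
The focal point idea mentioned above can be illustrated with a minimal Python sketch. The abstract does not state the basis sets or the extrapolation formula, so the form below is an assumption: a common two-point inverse-cubic extrapolation of the MP2 correlation energy, followed by an additive higher-order correction taken as the difference between DLPNO-CCSD(T) and MP2 in a smaller basis. All function names are hypothetical, and the numbers in the usage lines are synthetic placeholders, not values from this work.

```python
# Illustrative sketch only: a generic focal point composite scheme.
# The extrapolation form (E_X = E_CBS + A * X**-3) and all names here are
# assumptions for illustration, not the authors' stated protocol.

def cbs_two_point(e_small, e_large, x_small, x_large, power=3.0):
    """Extrapolate a correlation energy to the CBS limit from two basis sets.

    Assumes E_X = E_CBS + A * X**(-power), where X is the cardinal number
    of the basis set (e.g. 3 for triple-zeta, 4 for quadruple-zeta).
    """
    w_small = x_small ** power
    w_large = x_large ** power
    return (w_large * e_large - w_small * e_small) / (w_large - w_small)


def focal_point_energy(mp2_cbs, ccsdt_small, mp2_small):
    """Add a higher-order correction, computed in a smaller basis, to MP2/CBS."""
    delta_ccsdt = ccsdt_small - mp2_small
    return mp2_cbs + delta_ccsdt


def binding_energy(e_complex, e_host, e_guest):
    """Binding energy as the complex energy minus the sum of the fragments."""
    return e_complex - (e_host + e_guest)


# Synthetic check: build energies that follow the assumed X**-3 law exactly.
e_cbs_true, a = -1.0, 0.5
e_tz = e_cbs_true + a / 3**3
e_qz = e_cbs_true + a / 4**3
mp2_corr_cbs = cbs_two_point(e_tz, e_qz, 3, 4)  # recovers e_cbs_true
```

In such a scheme the same composite energy would be evaluated for the complex and for each fragment, and `binding_energy` would then be applied to the three results. Whether counterpoise corrections or other refinements are used is not specified in the abstract.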