Toward Reliable Conformational Energies of Amino Acids and Dipeptides: The DipCONFS Benchmark and DipCONL Datasets.

Christoph Plett, Stefan Grimme, Andreas Hansen

doi:10.1021/acs.jctc.4c00801

Abstract

Simulating peptides and proteins is becoming increasingly important, leading to a growing need for efficient computational methods. These are typically semiempirical quantum mechanical (SQM) methods, force fields (FFs), or machine-learned interatomic potentials (MLIPs), all of which require a large amount of accurate data for robust training and evaluation. To assess potential reference methods and complement the available data, we introduce two sets, DipCONFL and DipCONFS, which cover large parts of the conformational space of 17 amino acids and their 289 possible dipeptides in aqueous solution. The conformers were selected from the exhaustive PeptideCS dataset by Andris et al. [J. Phys. Chem. B 2022, 126, 5949–5958]. The structures, originally generated with GFN2-xTB, were reoptimized using the accurate r2SCAN-3c density functional theory (DFT) composite method including the implicit CPCM water solvation model. The DipCONFS benchmark set contains 918 conformers and is one of the largest sets with highly accurate coupled cluster conformational energies to date. It is employed to evaluate various DFT and wave function theory (WFT) methods, especially regarding whether they are accurate enough to serve as reliable reference methods for larger datasets intended for training and testing more approximate SQM, FF, and MLIP methods. The results reveal that the originally provided BP86-D3(BJ)/DGauss-DZVP conformational energies are not sufficiently accurate. Among the DFT methods tested as an alternative reference level, the revDSD-PBEP86-D4 double hybrid performs best, with a mean absolute deviation (MAD) of 0.2 kcal mol⁻¹ from the PNO-LCCSD(T)-F12b reference. The very efficient r2SCAN-3c composite method also shows excellent results, with an MAD of 0.3 kcal mol⁻¹, similar to ωB97M-D4, the best hybrid tested.
Based on these findings, we compiled the large DipCONFL set, which includes over 29,000 realistic conformers in solution with reasonably accurate r2SCAN-3c reference conformational energies, gradients, and further properties potentially relevant for training MLIP methods. This set is used, with DipCONFS as a point of comparison, to robustly assess the performance of various SQM, FF, and MLIP methods, and it can complement training sets for such methods.
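
The MAD values quoted in the abstract compare conformational energies from a tested method with those from a reference level. The sketch below illustrates one way such a number could be computed. It is not the authors' workflow. It assumes that conformational energies are taken relative to the conformer that is lowest in energy at the reference level, and that this anchor conformer is excluded from the average. The function names and the energies in the usage example are hypothetical placeholders, not values from DipCONFS or DipCONFL.

```python
from statistics import mean

# Standard conversion factor from Hartree to kcal/mol.
HARTREE_TO_KCAL_MOL = 627.5095


def conformational_energies(energies_hartree, anchor_index):
    """Energies in kcal/mol relative to the conformer at anchor_index."""
    anchor = energies_hartree[anchor_index]
    return [(e - anchor) * HARTREE_TO_KCAL_MOL for e in energies_hartree]


def mad_vs_reference(method_hartree, reference_hartree):
    """Mean absolute deviation of conformational energies (kcal/mol).

    Assumption: the anchor is the conformer with the lowest reference
    energy, and it is left out of the average because its relative
    energy is zero by construction for both methods.
    """
    if len(method_hartree) != len(reference_hartree):
        raise ValueError("both lists must cover the same conformers")
    if len(reference_hartree) < 2:
        raise ValueError("need at least two conformers")

    anchor = min(range(len(reference_hartree)), key=reference_hartree.__getitem__)
    method_rel = conformational_energies(method_hartree, anchor)
    reference_rel = conformational_energies(reference_hartree, anchor)

    return mean(
        abs(m - r)
        for i, (m, r) in enumerate(zip(method_rel, reference_rel))
        if i != anchor
    )


# Hypothetical total energies (Hartree) for three conformers of one molecule.
reference = [-100.0000, -99.9990, -99.9975]
method = [-100.2000, -100.1992, -100.1970]
print(f"MAD = {mad_vs_reference(method, reference):.2f} kcal/mol")
```

Because only energy differences between conformers enter, a constant offset between the total energies of the two methods cancels, which is why the two placeholder lists can differ by 0.2 Hartree in absolute terms and still agree closely.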

Published in: Journal of Chemical Theory and Computation
