Abstract

Here, the Quantum Mechanical Properties of Drug-like Molecules (QMugs) dataset is expanded so that it can serve as training data for surrogate machine learning models that predict quantum mechanical properties relevant to chemical reactivity. Small molecules from reaction databases, as well as charged and boron-containing compounds from ChEMBL, were added. Each compound was passed through a pipeline of MMFF94s/UFF conformer generation, GFN2-xTB geometry optimization, and a final density functional theory single-point calculation at the ωB97X-D/def2-SVP level of theory. In total, 71,632 new molecules were evaluated in this manner. Steric (solvent-accessible surface area, SASA) and dispersion (P_int) descriptors were computed at the semiempirical GFN2-xTB level of theory for the lowest-energy conformer of every species in the enlarged QMugs dataset. The expanded dataset is intended to support surrogate models of much broader scope than the original QMugs dataset, which was limited to biologically active compounds.
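
The control flow of the three-stage pipeline described above can be sketched as follows. This is only an illustrative skeleton, not the authors' implementation: the names `Conformer`, `PipelineResult`, `run_pipeline`, and the three stage callables are hypothetical, and the stages are passed in as plain functions so that real force-field, GFN2-xTB, and DFT back ends could be substituted. The sketch assumes that every optimized conformer receives a single-point calculation and that the "lowest-energy conformer" used for the descriptors is chosen by its energy after the optimization stage; the abstract does not specify either detail.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Conformer:
    """A geometry placeholder plus its energy at the most recent stage."""
    geometry: str   # stands in for real coordinates (e.g. an XYZ block)
    energy: float   # any consistent energy unit


@dataclass(frozen=True)
class PipelineResult:
    single_points: List[Conformer]   # one entry per optimized conformer
    descriptor_conformer: Conformer  # lowest-energy optimized conformer


def lowest_energy(conformers: Sequence[Conformer]) -> Conformer:
    """Return the conformer with the smallest energy."""
    if not conformers:
        raise ValueError("no conformers were generated")
    return min(conformers, key=lambda c: c.energy)


def run_pipeline(
    molecule: str,
    generate: Callable[[str], List[Conformer]],    # force-field stage (assumed)
    optimize: Callable[[Conformer], Conformer],    # semiempirical stage (assumed)
    single_point: Callable[[Conformer], Conformer] # DFT stage (assumed)
) -> PipelineResult:
    """Chain the three stages: generate, optimize, then single-point."""
    candidates = generate(molecule)
    optimized = [optimize(c) for c in candidates]
    # Assumption: descriptors are computed on the conformer that is lowest
    # in energy after the optimization stage.
    best = lowest_energy(optimized)
    # Assumption: every optimized conformer gets a single-point calculation.
    refined = [single_point(c) for c in optimized]
    return PipelineResult(single_points=refined, descriptor_conformer=best)


# Toy stand-ins for the real stages; the numbers are made up for illustration.
def toy_generate(smiles: str) -> List[Conformer]:
    return [Conformer(f"{smiles}/conf{i}", e) for i, e in enumerate([3.0, 1.5, 2.0])]


def toy_optimize(conf: Conformer) -> Conformer:
    return Conformer(conf.geometry + "/opt", conf.energy - 1.0)


def toy_single_point(conf: Conformer) -> Conformer:
    return Conformer(conf.geometry + "/sp", conf.energy * 10)


result = run_pipeline("CCO", toy_generate, toy_optimize, toy_single_point)
```

Keeping the stages as injected callables separates the bookkeeping (which conformer feeds which step) from the choice of chemistry software, which the abstract names only by method and level of theory.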
