Abstract

We introduce the ForceGen method for 3D structure generation and conformer elaboration of drug-like small molecules. ForceGen is novel, avoiding use of distance geometry, molecular templates, or simulation-oriented stochastic sampling. The method is primarily driven by the molecular force field, implemented using an extension of MMFF94s and a partial charge estimator based on electronegativity-equalization. The force field is coupled to algorithms for direct sampling of realistic physical movements made by small molecules. Results are presented on a standard benchmark from the Cambridge Crystallographic Database of 480 drug-like small molecules, including full structure generation from SMILES strings. Reproduction of protein-bound crystallographic ligand poses is demonstrated on four carefully curated data sets: the ConfGen Set (667 ligands), the PINC cross-docking benchmark (1062 ligands), a large set of macrocyclic ligands (182 total with typical ring sizes of 12–23 atoms), and a commonly used benchmark for evaluating macrocycle conformer generation (30 ligands total). Results compare favorably to alternative methods, and performance on macrocyclic compounds approaches that observed on non-macrocycles while yielding a roughly 100-fold speed improvement over alternative MD-based methods with comparable performance.

Highlights

  • We introduce a new method for 3D structure generation and conformational elaboration that does not rely on distance geometry, precalculated molecular templates, or stochastic sampling

  • We present results on a standard benchmark of 480 drug-like small molecules from the Cambridge Crystallographic Database used for validation of the OMEGA method [3, 14], 667 molecules from the MacroModel ConfGen validation study [15], 1062 ligands from the PINC cross-docking benchmark with deep representation of ten pharmaceutically relevant targets [9], 182 macrocyclic ligands from protein-ligand complexes curated from the PDB, and 30 macrocyclic ligands that form a commonly used benchmark originally reported by Chen and Foloppe [11]

  • Five data sets were studied in the course of this work, each addressing a different aspect of 3D structure generation and conformer generation and comparison to other approaches

Summary

Introduction

We introduce a new method for 3D structure generation and conformational elaboration that does not rely on distance geometry, precalculated molecular templates, or stochastic sampling. Structure generation and conformer elaboration for the ForceGen method each require seconds per molecule for non-macrocyclic drug-like ligands. Data have been collected that allow for fair and direct comparisons between the methods reported here and widely used alternatives. Collecting such data is challenging for three reasons: (1) some high-quality data sources prohibit redistribution of molecular structural data, necessitating re-acquisition; (2) many methodological developers and evaluators choose to provide only PDB and ligand HET codes, or only PDB codes with no indication of ligand identity, necessitating inferences as to ligand bond orders, tautomer states, formal charges, and even which ligand might be meant; and (3) molecular file formats and conversion utilities may introduce noise into the data, most commonly by producing incorrect annotations of chiral atoms and configurations of double bonds.
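The third difficulty, stereochemical annotations lost during file conversion, can be screened for with a simple check. The sketch below is not part of ForceGen; it is a crude illustration that assumes the structures are available as SMILES strings before and after conversion. The helper names `stereo_signature` and `stereo_preserved` are hypothetical. It only counts stereo markers, so it can flag annotations that were dropped but cannot detect a configuration that was inverted.

```python
import re
from typing import Tuple


def stereo_signature(smiles: str) -> Tuple[int, int]:
    """Count stereo annotations in a SMILES string (hypothetical helper).

    Returns (chiral_marks, bond_marks):
      chiral_marks: runs of '@' (tetrahedral chirality tags such as @ or @@)
      bond_marks:   '/' and '\\' characters (double-bond configuration)
    This is a text-level heuristic, not a chemical perception step.
    """
    chiral_marks = len(re.findall(r"@+", smiles))
    bond_marks = smiles.count("/") + smiles.count("\\")
    return chiral_marks, bond_marks


def stereo_preserved(before: str, after: str) -> bool:
    """True if both strings carry the same number of stereo annotations."""
    return stereo_signature(before) == stereo_signature(after)


if __name__ == "__main__":
    original = "N[C@@H](C)C(=O)O"   # alanine with a specified stereocenter
    converted = "NC(C)C(=O)O"       # same skeleton, chirality tag lost
    print(stereo_signature(original), stereo_signature(converted))
    print("preserved:", stereo_preserved(original, converted))
```

A mismatch in either count suggests that the conversion step dropped a chiral tag or a double-bond configuration and that the structure should be re-checked against its source. A proper check would compare perceived stereochemistry with a cheminformatics toolkit rather than counting characters.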

Quality measurements are calculated:
Computational procedures and statistical analysis
Results and discussion
Conclusions
Compliance with ethical standards