Abstract

We propose a computational workflow for designing novel drug-like molecules that combines global optimization of molecular properties with machine-learning-based prediction of protein-ligand docking scores. Most existing methods, however, depend heavily on experimental data, and many targets do not have sufficient data to train reliable activity prediction models. To overcome this limitation, protein-ligand docking calculations must be performed using the limited data available. Running such docking calculations during molecular generation requires considerable computational time, which prevents extensive exploration of chemical space. To address this problem, we trained a machine-learning model that predicts the docking energy from a SMILES string, which accelerates the molecular generation process. Docking scores could be accurately predicted using only a SMILES string. We combined this docking score prediction model with the global molecular property optimization approach, MolFinder, to find novel molecules that exhibit the desired properties and have high predicted docking scores. We named this design approach V-dock. Using V-dock, we efficiently generated many novel molecules with high docking scores for a target protein, similarity to a reference molecule, and desirable drug-like and bespoke properties, such as QED. The predicted docking scores of the generated molecules were verified by correlating them with the actual docking scores.
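The speed-up argument in the abstract can be illustrated with a minimal sketch: a cheap surrogate scores every candidate produced during the search, and the expensive docking calculation is run only on the few finalists for verification. This is a generic elitist search written for illustration, not MolFinder itself. The names `surrogate_guided_search`, `mutate`, `surrogate`, and `expensive_score` are hypothetical, and the toy usage searches over plain numbers rather than molecules.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def surrogate_guided_search(
    seeds: Sequence[T],
    mutate: Callable[[T, random.Random], T],
    surrogate: Callable[[T], float],
    expensive_score: Callable[[T], float],
    generations: int = 20,
    population: int = 10,
    top_k: int = 3,
    seed: int = 0,
) -> List[Tuple[T, float, float]]:
    """Search with the cheap surrogate; run the expensive score on finalists only."""
    rng = random.Random(seed)
    pool = list(seeds)
    for _ in range(generations):
        # Every new candidate is ranked by the surrogate alone.
        children = [mutate(rng.choice(pool), rng) for _ in range(population)]
        pool = sorted(pool + children, key=surrogate, reverse=True)[:population]
    # Only the top_k survivors are passed to the expensive calculation.
    return [(c, surrogate(c), expensive_score(c)) for c in pool[:top_k]]


class CountingScore:
    """Toy stand-in for a docking run (assumption); counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, x: float) -> float:
        self.calls += 1
        return -(x - 3.0) ** 2


def toy_surrogate(x: float) -> float:
    # Toy surrogate (assumption): same landscape as the toy docking score.
    return -(x - 3.0) ** 2


def toy_mutate(x: float, rng: random.Random) -> float:
    # Toy mutation (assumption): a small random step.
    return x + rng.uniform(-0.5, 0.5)


toy_docking = CountingScore()
results = surrogate_guided_search([0.0], toy_mutate, toy_surrogate, toy_docking)
print(f"expensive evaluations: {toy_docking.calls}")
```

With the default settings the surrogate is called for every candidate in each of the 20 generations, while the expensive function is called only three times, once per finalist.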

Highlights

  • Effective drug design requires optimizing various physicochemical properties, such as molecular weight, log P, the number of hydrogen-bond donors and acceptors, polar surface area, and affinity for a target protein [1].

  • The Pearson correlation coefficient for the training set was 0.85, and that for the whole set was 0.89. These results showed that the docking energy could be predicted accurately using only the SMILES representation.

  • The distributions of the predicted docking energy, the quantitative estimate of drug-likeness (QED), and the similarity of the generated molecules to ML216 confirm that the targeted properties are optimized relative to those of the molecules in the existing SureChEMBL dataset.
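The Pearson correlation coefficient quoted in the highlights measures how linearly the predicted docking energies track the computed ones. A minimal sketch of the calculation follows; the function name `pearson_r` and the example values are illustrative assumptions, not data from the paper.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Made-up docking energies (assumption), only to show the call.
computed = [-7.1, -8.4, -6.0, -9.2]
predicted = [-7.0, -8.0, -6.5, -9.0]
r = pearson_r(computed, predicted)
```

A value close to 1 means the surrogate ranks molecules in nearly the same order as the docking calculation, which is the property the optimization relies on.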

Summary

Introduction

Effective drug design requires optimizing various physicochemical properties, such as molecular weight, log P, the number of hydrogen-bond donors and acceptors, polar surface area, and affinity for a target protein [1]. V-dock uses machine-learning algorithms to predict the docking score and then applies the global molecular property optimization algorithm, MolFinder, to optimize that score [2]. We show that this new workflow significantly reduces the required computational resources compared to conventional approaches based on ligand docking and facilitates the design of novel drug-like molecules with the desired properties. Boitreaud et al. [23] proposed OptiMol, a drug design approach that optimizes binding energy using a generative model and docking, with conditioning by adaptive sampling (CbAS) [24] to maximize the objective function. They showed that the method finds compounds with high affinity for a given target protein more efficiently than reinforcement-learning-based algorithms. This suggests that our V-dock approach is a promising and general approach for generating novel molecules that satisfy specific properties of interest.
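As a rough illustration of optimizing several properties at once, the sketch below combines a predicted docking score, similarity to a reference molecule, and QED into one weighted sum. It is not the objective function used in the paper: the weights, the `dock_scale` normalization, the "higher is better" convention for the docking score, and all names and numbers are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Candidate:
    smiles: str
    predicted_docking_score: float  # surrogate output; higher treated as better (assumption)
    similarity: float               # similarity to the reference molecule, in [0, 1]
    qed: float                      # drug-likeness estimate, in [0, 1]


def objective(
    c: Candidate,
    w_dock: float = 1.0,
    w_sim: float = 1.0,
    w_qed: float = 1.0,
    dock_scale: float = 10.0,
) -> float:
    """Weighted sum of the three terms; weights and scale are hypothetical."""
    # Dividing by dock_scale brings the docking term to a range comparable
    # with the two terms that already lie in [0, 1].
    return (
        w_dock * c.predicted_docking_score / dock_scale
        + w_sim * c.similarity
        + w_qed * c.qed
    )


def rank(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Return candidates sorted from best to worst objective value."""
    return sorted(candidates, key=objective, reverse=True)


# Placeholder molecules with made-up property values (assumption).
a = Candidate("CCO", predicted_docking_score=8.0, similarity=0.5, qed=0.6)
b = Candidate("c1ccccc1", predicted_docking_score=6.0, similarity=0.4, qed=0.5)
ranked = rank([b, a])
```

Changing the weights shifts the trade-off: setting `w_sim` to zero, for example, ranks candidates on predicted docking score and QED alone.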

Docking Energy Prediction Model Training Result
Optimization Results for Generated Molecules
Molecular Validation Designed with V-Dock
Possible Limitations of the V-Dock Approach
Drug-likeness of the Generated Molecule
Protein-Ligand Docking Energy Calculations
Docking Score Prediction Model
Objective Function
Molecular Generation Using MolFinder
Conclusions