Abstract

Rational compound design remains a challenging problem for both computational methods and medicinal chemists. Computational generative methods have begun to show promising results for the design problem. However, they have not yet used the power of three-dimensional (3D) structural information. We have developed a novel graph-based deep generative model that combines state-of-the-art machine learning techniques with structural knowledge. Our method (“DeLinker”) takes two fragments or partial structures and designs a molecule incorporating both. The generation process is protein-context-dependent, utilizing the relative distance and orientation between the partial structures. This 3D information is vital to successful compound design, and we demonstrate its impact on the generation process and the limitations of omitting such information. In a large-scale evaluation, DeLinker designed 60% more molecules with high 3D similarity to the original molecule than a database baseline. When considering the more relevant problem of longer linkers with at least five atoms, the outperformance increased to 200%. We demonstrate the effectiveness and applicability of this approach on a diverse range of design problems: fragment linking, scaffold hopping, and proteolysis targeting chimera (PROTAC) design. As far as we are aware, this is the first molecular generative model to incorporate 3D structural information directly in the design process. The code is available at https://github.com/oxpig/DeLinker.
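As an illustration of the "relative distance and orientation" mentioned above, the sketch below computes one plausible pair of geometric descriptors for two fragments. It is not taken from the DeLinker code: the function name `relative_geometry`, the use of a single attachment atom per fragment, and the use of a direction vector at each attachment point are assumptions made only for this example.

```python
import math

def relative_geometry(anchor_a, exit_dir_a, anchor_b, exit_dir_b):
    """Return (distance, angle_in_radians) between two fragment attachment points.

    Hypothetical inputs (assumptions for illustration):
      anchor_*   : 3D coordinates of the atom where the linker would attach
      exit_dir_* : 3D direction in which the linker would leave that atom
    """
    # Euclidean distance between the two attachment atoms.
    distance = math.dist(anchor_a, anchor_b)

    # Angle between the two exit directions, from the normalized dot product.
    norm_a = math.hypot(*exit_dir_a)
    norm_b = math.hypot(*exit_dir_b)
    cos_angle = sum(x * y for x, y in zip(exit_dir_a, exit_dir_b)) / (norm_a * norm_b)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return distance, math.acos(cos_angle)

# Two fragments whose attachment atoms are 5 units apart with perpendicular exit directions.
distance, angle = relative_geometry((0, 0, 0), (1, 0, 0), (3, 4, 0), (0, 1, 0))
```

Descriptors of this kind depend only on the relative placement of the fragments, so they are unchanged if both fragments are rotated or translated together.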

Highlights

  • We first checked the impact of the structural information and assessed our generative method in three experiments: (i) large-scale validation on ZINC, (ii) large-scale validation on CASF, and (iii) three case studies covering fragment linking,[49] scaffold hopping,[50] and proteolysis targeting chimera (PROTAC) design.[51]

  • To assess the importance of including structural information, we empirically examined its impact on the generation process (Table 1)

Introduction

Drug design is an iterative process that requires potential compounds to be optimized for specific properties, ranging from binding affinity to pharmacokinetics. This process is challenging, in part, because of the size of the search space[1] and the discontinuous nature of the optimization landscape.[2] Typically, molecule design is undertaken by human experts and is a subjective process. First, methods have been developed to generate molecules that follow the same distribution as the training set, whether a general set of molecules[10] such as ZINC[17] or ChEMBL,[18] or a more focused one such as inhibitors for a particular protein target.[7,19] Second, generative models have been proposed to perform molecular optimization, taking an input molecule and attempting to modify one or several chemical properties, typically subject to a similarity constraint.[16,20,21]
