Abstract

A framework is presented for the calculation of novel alignment-free descriptors of molecular shape. The methods are based on spectral geometry, a technique developed in the field of computer vision, where it has shown impressive performance for the comparison of deformable objects such as people and animals. Spectral geometry techniques encode shape by capturing the curvature of the surface of an object in a compact, information-rich representation that is alignment-free and also invariant to isometric deformations, that is, changes that do not distort distances over the surface. Here, we adapt the technique to the new domain of molecular shape representation and describe a series of parametrization steps aimed at optimizing the method for this domain. Our focus is on demonstrating that the basic approach is able to capture molecular shape in a compact and information-rich descriptor. We demonstrate improved performance in virtual screening over a more established alignment-free method and impressive performance compared to a more accurate, but much more computationally demanding, alignment-based approach.

Highlights

  • The virtual screening results are insensitive to the number of nearest neighbors; the performance values are worse than those of the Hard Vector Quantization (HQ) encoding but still better than those of the Soft Vector Quantization (SQ) encoding

  • We have described a framework for applying spectral geometry to the problem of molecular shape comparison for 3D virtual screening

Summary

Introduction

The development of in silico methods for shape-based searching of small molecules has been a topic of considerable interest for many years.[1−3] This is because shape is fundamental to molecular recognition events such as a drug binding to a biological receptor. A key advantage of shape searching over 2D fragment-based methods is that it is more amenable to scaffold hopping, that is, finding hits that belong to different chemical series. This is important for drug discovery projects since it allows them to be moved into new areas of chemical space, increasing the chance of generating new intellectual property while mitigating potential side effects or synthetic intractability associated with existing compounds. ROCS (Rapid Overlay of Chemical Structures),[4] the industry-standard alignment method, uses Gaussian functions to represent atomic volumes, which allows the rapid calculation of the overlap volume of aligned molecules. The Enamine REAL database consists of 680 million compounds that are available for purchase through one-step synthesis, and the GDB-17 database of virtual compounds with up to 17 heavy atoms consists of 166 billion compounds.[8]
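The Gaussian representation of atomic volumes mentioned above can be illustrated with a minimal sketch. It is not the ROCS implementation: it assumes the molecules are already aligned, keeps only pairwise (first-order) overlap terms, and uses hypothetical names (`gaussian_overlap`, `overlap_volume`, `shape_tanimoto`) together with illustrative values for the Gaussian amplitude `p` and width `alpha`. The only fact relied on is the closed-form overlap integral of two spherical Gaussians.

```python
import math

def gaussian_overlap(center_a, alpha_a, center_b, alpha_b, p=1.0):
    """Closed-form overlap integral of two spherical Gaussians
    p*exp(-alpha*|r - R|^2). The amplitude p is an illustrative assumption."""
    d2 = sum((x - y) ** 2 for x, y in zip(center_a, center_b))
    s = alpha_a + alpha_b
    return p * p * (math.pi / s) ** 1.5 * math.exp(-alpha_a * alpha_b * d2 / s)

def overlap_volume(mol_a, mol_b):
    """Pairwise (first-order) overlap between two pre-aligned molecules.
    Each molecule is a list of (center, alpha) tuples."""
    return sum(
        gaussian_overlap(ca, aa, cb, ab)
        for ca, aa in mol_a
        for cb, ab in mol_b
    )

def shape_tanimoto(mol_a, mol_b):
    """Tanimoto-style similarity built from the overlap volumes."""
    v_ab = overlap_volume(mol_a, mol_b)
    v_aa = overlap_volume(mol_a, mol_a)
    v_bb = overlap_volume(mol_b, mol_b)
    return v_ab / (v_aa + v_bb - v_ab)

# Toy example: three "atoms" with an illustrative width alpha = 0.8.
mol = [((0.0, 0.0, 0.0), 0.8), ((1.5, 0.0, 0.0), 0.8), ((0.0, 1.5, 0.0), 0.8)]
# The same molecule translated by 1.0 along z, i.e. a poor alignment.
shifted = [((x, y, z + 1.0), a) for (x, y, z), a in mol]

print(shape_tanimoto(mol, mol))
print(shape_tanimoto(mol, shifted))
```

Because every term is an analytic expression, the overlap can be evaluated without numerical integration, which is what makes Gaussian-based alignment fast per pair. The cost that remains is the search over alignments, and it is this step that alignment-free descriptors avoid.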
