Abstract
Background: Discerning the similarity between molecules is a challenging problem in drug discovery as well as in molecular biology. The importance of this problem is due to the fact that the biochemical characteristics of a molecule are closely related to its structure. Therefore molecular similarity is a key notion in investigations targeting exploration of molecular structural space, query-retrieval in molecular databases, and structure-activity modelling. Determining molecular similarity is related to the choice of molecular representation. Currently, representations with high descriptive power and physical relevance like 3D surface-based descriptors are available. Information from such representations is both surface-based and volumetric. However, most techniques for determining molecular similarity tend to focus on idealized 2D graph-based descriptors due to the complexity that accompanies reasoning with more elaborate representations.
Results: This paper addresses the problem of determining similarity when molecules are described using complex surface-based representations. It proposes an intrinsic, spherical representation that systematically maps points on a molecular surface to points on a standard coordinate system (a sphere). Molecular surface properties such as shape, field strengths, and effects due to field super-positioning can then be captured as distributions on the surface of the sphere. Surface-based molecular similarity is subsequently determined by computing the similarity of the surface-property distributions using a novel formulation of histogram-intersection. The similarity formulation is not only sensitive to the 3D distribution of the surface properties, but is also highly efficient to compute.
Conclusion: The proposed method obviates the computationally expensive step of molecular pose-optimisation, can incorporate conformational variations, and facilitates highly efficient determination of similarity by directly comparing molecular surfaces and surface-based properties. Retrieval performance, applications in structure-activity modeling of complex biological properties, and comparisons with existing research and commercial methods demonstrate the validity and effectiveness of the approach.
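To make the idea of comparing surface-property distributions on a sphere more concrete, the following is a minimal sketch of the classic histogram intersection (the sum of bin-wise minima), not the novel formulation the paper proposes, whose details are not given in this abstract. The equal-angle binning of the sphere, the function names `spherical_histogram` and `histogram_intersection`, and the sample `(theta, phi, value)` tuples are all assumptions made for illustration.

```python
import math


def spherical_histogram(points, n_theta=6, n_phi=12):
    """Bin (theta, phi, value) samples on a unit sphere into a flat histogram.

    theta is the polar angle in [0, pi]; phi is the azimuth in radians.
    Each sample adds its non-negative property value to its bin.
    The equal-angle grid is an assumption; its bins are not equal in area.
    """
    hist = [0.0] * (n_theta * n_phi)
    for theta, phi, value in points:
        # Clamp so that theta == pi falls into the last polar band.
        i = min(int(theta / math.pi * n_theta), n_theta - 1)
        # Wrap the azimuth into [0, 2*pi) before binning.
        j = int((phi % (2 * math.pi)) / (2 * math.pi) * n_phi) % n_phi
        hist[i * n_phi + j] += value
    total = sum(hist)
    # Normalise so the bins sum to 1; an all-zero histogram is returned as-is.
    return [h / total for h in hist] if total > 0 else hist


def histogram_intersection(h1, h2):
    """Classic histogram intersection: sum of bin-wise minima.

    For two normalised histograms the score lies in [0, 1],
    with 1 meaning identical distributions.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical surface samples for two molecules (illustrative values only).
mol_a = [(0.3, 0.5, 1.0), (1.2, 2.0, 2.0), (2.5, 4.0, 1.0)]
mol_b = [(0.3, 0.5, 1.0), (1.2, 2.0, 1.0), (2.9, 5.5, 2.0)]

hist_a = spherical_histogram(mol_a)
hist_b = spherical_histogram(mol_b)
score = histogram_intersection(hist_a, hist_b)
print(score)
```

Because both histograms live on the same fixed grid, the comparison is a single pass over the bins, which is in the spirit of the efficiency the abstract describes. This sketch assumes both molecules have already been mapped to a common spherical coordinate frame and does not itself address pose or conformational variation.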
Highlights
Introduction to molecular representations and descriptors: In their simplest form, molecules can be represented using chemical formulae.
An analysis of the results obtained in this step indicates that the accuracy of the proposed approach during query-retrieval is comparable to that of ISIS. This holds even though the proposed method addresses query-retrieval in a setting that involves molecular conformations, surface properties, and superposition-based effects, which is much more complex than the 2D structural motif-based search used in ISIS.
We considered the problem of defining similarity between molecules based on complex surface-based representations. Such representations capture the physics of the molecules better than commonly used molecular-graph-based approaches and can have significant relevance in molecular query-retrieval, similarity-based exploration of structural space, and structure-activity modelling.
Summary
Introduction to molecular representations and descriptors: In their simplest form, molecules can be represented using chemical formulae. Discerning the similarity between molecules is a challenging problem in drug discovery as well as in molecular biology. The importance of this problem is due to the fact that the biochemical characteristics of a molecule are closely related to its structure. Representations with high descriptive power and physical relevance, like 3D surface-based descriptors, are available. Most techniques for determining molecular similarity tend to focus on idealized 2D graph-based descriptors due to the complexity that accompanies reasoning with more elaborate representations. Across all biological and pharmaceutical investigations, the discovery (or development) of molecules with desired biological activity is an important goal. Efforts to attain this goal are strongly driven by the notion of molecular similarity because, in general, similar molecules tend to behave similarly [1,2]. The last subsection introduces the problems associated with determining molecular similarity using complex 3D surface-based descriptors.