Abstract

This paper addresses the problem of determining the structural similarity of two molecules, i.e., the similarity of the interconnection of all the elementary cycles in the corresponding molecular graphs. We propose and analyze an algorithmic approach based on solving the Maximum Common Edge Subgraph (MCES) problem on graphs that represent the interaction of the cycles of molecules. Using the ChEBI database, we compare this approach, in terms of structural similarity and computation time, with two similarity computations on molecular graphs: one based on the MCES, the other using different fingerprints (Daylight, ECFP4, ECFP6, FCFP4, FCFP6) to compute the Tanimoto coefficient. We also analyze the structural similarity results obtained for a selected subset of molecules.

Highlights

  • This article focuses on algorithmic approaches to compute the structural similarity of pairs of molecules in large molecular databases

  • We compare, on real cases, the performance of three approaches to computing the structural similarity of molecules: two approaches using the Maximum Common Edge Subgraph (MCES), applied to molecular graphs (MG) and to graphs of cycles (GC), and one approach based on fingerprints of molecular graphs and the Tanimoto coefficient (TC) [19]

  • We show that graphs of cycles (GC) can capture the structural similarity of molecules; that MG does not take cycles into account when the structural part is considered; and that GC and the Tanimoto coefficient (TC) do not compute the same kind of similarity, even if the results are sometimes similar
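
As a point of reference for the fingerprint-based approach, the Tanimoto coefficient of two fingerprints is the number of bits set in both divided by the number of bits set in at least one. The sketch below is a minimal illustration, assuming each fingerprint is given as a set of "on" bit indices. The function name `tanimoto`, the example fingerprints, and the convention for two empty fingerprints are assumptions made here, not taken from the paper, and no actual Daylight, ECFP, or FCFP fingerprints are generated.

```python
from typing import AbstractSet


def tanimoto(fp_a: AbstractSet[int], fp_b: AbstractSet[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    if not fp_a and not fp_b:
        # Assumption: two empty fingerprints are treated as identical.
        return 1.0
    common = len(fp_a & fp_b)
    # |A ∩ B| / |A ∪ B|, with the union size computed by inclusion-exclusion.
    return common / (len(fp_a) + len(fp_b) - common)


# Hypothetical fingerprints (illustrative bit positions, not from ChEBI).
fp_x = {1, 4, 7, 9}
fp_y = {1, 4, 8, 9, 12}
print(tanimoto(fp_x, fp_y))  # 3 shared bits out of 6 distinct bits -> 0.5
```

Because the coefficient depends only on which bits are shared, two molecules can score highly without having the same arrangement of cycles, which is consistent with the observation above that GC and TC do not measure the same kind of similarity.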


Summary

Introduction

Motivation. This article focuses on algorithmic approaches to compute the structural similarity of pairs of molecules in large molecular databases. In organic chemistry, when a new molecule is designed, it is necessary to determine chemical reactions that can be used to synthesize this target molecule from available compounds. Finding such chemical reactions usually consists of searching a reaction database (such as REAXYS [1] or ChEBI [2]) for a molecule that is structurally close to the target molecule. When molecules are modeled by graphs [5] or hypergraphs, several definitions and similarity approaches between

