Abstract

We propose a new invariant, the product of the primes corresponding to the ring size of each bond of an atom, as a simple, unambiguous ring invariant of an atom. It allows symmetry classes in highly symmetrical molecular graphs to be distinguished using traditional local and distance atom invariants. We also propose modifications of Weininger’s CANON algorithm to avoid its ambiguities (swapping and leveling of ranks, incorrect determination of symmetry classes in non-aromatic annulenes, and arbitrary selection of an atom for breaking ties). Together, the atomic ring invariant and the Modified CANON algorithm give a rigorous procedure for generating canonical SMILES, which can be used for accurate and fast structural searching in large chemical databases.

Highlights

  • The perception of the symmetry of atoms in molecular graphs is an important problem for chemoinformatics

  • We propose an efficient graph invariant atom partitioning (GIAP) procedure, which for almost all molecular graphs correctly partitions the atoms into symmetry classes, followed by a canonical code generation by automorphism permutation (CCAP) step that guarantees the uniqueness of the canonical code for a given molecular graph

  • Symmetry perception: determining the local invariants and the chirality invariant of the atoms has constant computational complexity, while determining the atomic ring invariant is quadratic in the worst case. The number of bonds in a molecular graph is approximately equal to the number of atoms, and for each bond we must find the shortest non-trivial path between its two atoms by a breadth-first search, which is linear in the worst case
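
The ring invariant described in the highlights can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a bond's ring size is the size of the smallest ring containing it, that a ring of size k maps to the k-th prime, and that acyclic bonds contribute a factor of 1; the summary does not state the exact mapping. The names `smallest_ring_size`, `nth_prime` and `ring_invariants` are hypothetical.

```python
from collections import deque


def smallest_ring_size(adj, u, v):
    """Return the size of the smallest ring containing bond u-v, or 0 if acyclic.

    Runs a breadth-first search from u to v that ignores the direct bond u-v,
    i.e. it finds the shortest non-trivial path between the two atoms.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if a == u and b == v:
                continue  # skip the bond itself
            if b not in dist:
                dist[b] = dist[a] + 1
                if b == v:
                    return dist[b] + 1  # path length plus the closing bond
                queue.append(b)
    return 0


def nth_prime(n):
    """Return the n-th prime (1-based) by trial division; fine for small n."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate


def ring_invariants(adj):
    """Map each atom to the product of primes over the ring sizes of its bonds.

    Assumption: ring size k maps to the k-th prime; acyclic bonds count as 1.
    """
    invariants = {}
    for atom, neighbours in adj.items():
        product = 1
        for nbr in neighbours:
            size = smallest_ring_size(adj, atom, nbr)
            if size:
                product *= nth_prime(size)
        invariants[atom] = product
    return invariants


# Methylcyclopropane skeleton: atoms 0-2 form the ring, atom 3 is the methyl carbon.
methylcyclopropane = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(ring_invariants(methylcyclopropane))  # {0: 25, 1: 25, 2: 25, 3: 1}
```

One breadth-first search per bond, each linear in the size of the graph, gives the quadratic worst case mentioned above. In this example the ring invariant separates ring atoms from the methyl carbon; telling the substituted ring atom from the other two still needs a local invariant such as the atom degree.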


Summary

Introduction

The perception of the symmetry of atoms in molecular graphs is an important problem for chemoinformatics. The correct determination of the symmetry classes for the atoms in a molecular graph is a basis for finding a canonical ordering of the atoms in a molecule, which in turn is necessary for generating a unique representation of the molecule [1]. This approach is used in the canonicalization algorithms for the linear representations of the molecular graphs like SMILES [2–5] and InChI [4, 6].

2. Algorithm for generation of unique SMILES notation

Methods
Results
Conclusion
