Abstract

Rigorous characterization of small organic molecules in terms of their structural and biological properties is vital to biomedical research. The three-dimensional structure of a molecule, its ‘photo ID’, is inefficient for searching and matching tasks. Instead, identifiers play a key role in accessing compound data. Unique and reproducible molecule and atom identifiers are required to ensure the correct cross-referencing of properties associated with compounds archived in databases. The best approach to this requirement is the International Chemical Identifier (InChI). However, the current implementation of InChI fails to provide a complete standard for atom nomenclature, and incorrect use of the InChI standard has resulted in the proliferation of non-unique identifiers. We propose a methodology and associated software tools, named ALATIS, that overcome these shortcomings. ALATIS is an adaptation of InChI that operates fully within the InChI convention to provide unique and reproducible molecule and all-atom identifiers. ALATIS includes an InChI extension for unique atom labeling of symmetric molecules. ALATIS forms the basis for improving reproducibility and unifying cross-referencing across databases.

Highlights

  • Multiple types of experimental data on individual compounds reside in different databases: X-ray structures, nuclear magnetic resonance (NMR) data, mass spectroscopic (MS) data, pKa values, melting points, etc.

  • The complete content of BMRB[21] and HMDB[22,23,24] relevant to metabolite entries provides a full demonstration of ALATIS

  • We summarize the five key capabilities of ALATIS below; Supplementary Information 1 contains additional details

Introduction

Small organic molecules, some classified as ligands[1,2,3,4,5], fragments[6,7,8,9,10], or metabolites[11,12,13,14,15,16,17], play prominent roles in biomedical research, for example, in biomarker discovery, screening, and drug discovery[18,19,20]. Retrieving reliable information about a molecule from different databases depends on those databases using standard, unique molecule and atom identifiers. Our investigation has revealed that these requirements for standardized unique molecule- and atom-level identifiers are not fully met in a variety of databases that contain information on organic molecules. These findings have prompted us to investigate approaches to such nomenclature and to propose a solution to their current deficiencies.
