Abstract
Metabolomics is the study of the small molecules, called metabolites, of a cell, tissue or organism. It is of particular interest because endogenous metabolites represent the phenotype resulting from gene expression. A major challenge in metabolomics research is the structural identification of unknown biochemical compounds in complex biofluids. In this paper we present an efficient cheminformatics tool, BioSMXpress, that uses known endogenous mammalian biochemicals and graph matching methods to identify endogenous mammalian biochemical structures in chemical structure space. The results of a comprehensive set of empirical experiments suggest that BioSMXpress identifies endogenous mammalian biochemical structures with high accuracy. BioSMXpress is 8 times faster than our previous tool, BioSM, without compromising the accuracy of its predictions. BioSMXpress is freely available at http://engr.uconn.edu/~rajasek/BioSMXpress.zip
Highlights
Metabolomics is the comprehensive, qualitative, and quantitative study of all the small molecules, called metabolites, in an organism [1]
In this paper we propose BioSMXpress, an efficient cheminformatics tool that can be used to identify biological compounds based on their molecular structures
As in our previous work, the prediction method applied by BioSMXpress relies on a set of endogenous mammalian biochemical compounds, hereafter referred to as scaffolds, obtained from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database
Summary
Metabolomics is the comprehensive, qualitative, and quantitative study of all the small molecules, called metabolites, in an organism [1]. Several on-line chemical structure databases now provide vital support for molecular identification by allowing candidate compounds to be searched by comparing experimentally determined features with computationally simulated features. Cheminformatics methods are needed to efficiently search such large chemical databases and potentially identify unknown endogenous biochemical compounds. BioSMXpress uses a threshold that is based on the number of atoms in both the query structure and the scaffold being examined. Knowing this threshold makes it possible to reach a confident decision about the query compound without exhaustively searching the entire scaffold list. To do this efficiently, BioSMXpress selects the scaffolds that have the potential to promote the candidate in the least possible time. Only those scaffolds with enough atoms to satisfy the given threshold are checked against the candidate compound for similarity.
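The atom-count pre-filter described above can be illustrated with a minimal sketch. It is not the BioSMXpress implementation: the summary does not give the threshold formula, so the sketch assumes a hypothetical rule in which a scaffold promotes the query when the number of matched atoms is at least a fraction `threshold` of the query's atoms. Under that assumption a scaffold with fewer than `ceil(threshold * query_atoms)` atoms can never succeed and can be skipped without running graph matching. The names `Molecule`, `min_scaffold_atoms`, `eligible_scaffolds`, and `first_promoting_scaffold` are illustrative, the smallest-first ordering is one possible reading of "in the least possible time", and `matched_atoms` is a caller-supplied stand-in for the actual graph matching step.

```python
import math
from typing import Callable, List, NamedTuple, Optional


class Molecule(NamedTuple):
    """Hypothetical stand-in for a molecular graph: a name and an atom count."""
    name: str
    atom_count: int


def min_scaffold_atoms(query_atoms: int, threshold: float) -> int:
    # Assumed rule: a match needs at least threshold * query_atoms matched atoms,
    # and a scaffold cannot match more atoms than it contains.
    return math.ceil(threshold * query_atoms)


def eligible_scaffolds(query: Molecule, scaffolds: List[Molecule],
                       threshold: float) -> List[Molecule]:
    """Keep only scaffolds large enough to reach the threshold, smallest first."""
    needed = min_scaffold_atoms(query.atom_count, threshold)
    large_enough = (s for s in scaffolds if s.atom_count >= needed)
    # Assumption: smaller scaffolds are cheaper to match, so try them first.
    return sorted(large_enough, key=lambda s: s.atom_count)


def first_promoting_scaffold(
    query: Molecule,
    scaffolds: List[Molecule],
    threshold: float,
    matched_atoms: Callable[[Molecule, Molecule], int],
) -> Optional[Molecule]:
    """Return the first eligible scaffold that satisfies the threshold, if any."""
    for scaffold in eligible_scaffolds(query, scaffolds, threshold):
        # The expensive matching step runs only on pre-filtered scaffolds,
        # and the search stops at the first scaffold that meets the threshold.
        if matched_atoms(query, scaffold) >= threshold * query.atom_count:
            return scaffold
    return None
```

In this sketch the pre-filter costs a single pass over the atom counts, so the expensive matching step is spent only on scaffolds that could still change the decision about the query compound.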