Abstract

In untargeted MS-based metabolomics studies, the proportion of unknown or unidentifiable compounds (i.e., features) detected can often exceed 90%. Given that the proper identification of a true unknown can take many months or years of work, it is little wonder that few investigators are willing to undertake the task of rigorously identifying these unknowns. While experimental techniques such as suspect screening can lead to the occasional “lucky” hit, a more rapid and robust approach is needed for unknown identification. In this presentation I will introduce the concept of in silico metabolomics. This is a computational approach to unknown identification that combines the extensive knowledge of known compounds with the existing knowledge of how compounds are chemically or biologically transformed. In silico metabolomics fundamentally requires a large collection of known structures. Over the past 10 years we have created a number of compound databases that catalogue the known compounds, including human metabolites (HMDB), food constituents (FooDB), drugs (DrugBank), plant products (PhytoBank) and contaminants (ContaminantDB). We have also developed a software package called BioTransformer that uses expert knowledge combined with machine learning to accurately predict the biological and chemical transformations that known compounds may undergo in humans and in the environment. This software has been used to create a database called BioTransformerDB consisting of several million “biologically feasible” structures. By exploiting several in-house tools for accurate MS/MS and NMR spectral prediction, we have been able to calculate the MS/MS and NMR spectra for all of the compounds in BioTransformerDB. Using these newly developed software tools and resources for in silico metabolomics, I will show how unknown compounds may be identified from untargeted MS studies.

A video of the keynote by Dr. David S. Wishart can be found at: https://www.youtube.com/watch?v=CAU_cWPtNHQ&feature=youtu.be
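The core idea in the abstract (expand a library of known compounds with predicted transformation products, then match observed features against it) can be illustrated with a minimal sketch. This is not BioTransformer's rule set or API: the `TRANSFORMATIONS` table, the `expand` and `match_mz` helpers, and the 5 ppm tolerance are assumptions chosen for illustration, and real tools operate on structures rather than on mass shifts alone. The monoisotopic masses are standard values.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da, added to a neutral mass to model an [M+H]+ ion


@dataclass(frozen=True)
class Compound:
    name: str
    mono_mass: float  # neutral monoisotopic mass in Da


# Illustrative mass shifts for a few common biotransformations (assumed table,
# not taken from BioTransformer).
TRANSFORMATIONS = {
    "hydroxylation": 15.994915,     # +O
    "methylation": 14.015650,       # +CH2
    "glucuronidation": 176.032088,  # +C6H8O6
    "sulfation": 79.956815,         # +SO3
}


def expand(compounds):
    """Return the parent compounds plus one-step predicted products."""
    expanded = list(compounds)
    for parent in compounds:
        for label, shift in TRANSFORMATIONS.items():
            expanded.append(
                Compound(f"{parent.name} + {label}", parent.mono_mass + shift)
            )
    return expanded


def match_mz(observed_mz, library, tol_ppm=5.0):
    """Return (name, ppm error) for library entries whose [M+H]+ m/z is within tolerance."""
    hits = []
    for compound in library:
        theoretical = compound.mono_mass + PROTON_MASS
        error_ppm = (observed_mz - theoretical) / theoretical * 1e6
        if abs(error_ppm) <= tol_ppm:
            hits.append((compound.name, error_ppm))
    return hits


known = [
    Compound("caffeine", 194.080376),      # C8H10N4O2
    Compound("theophylline", 180.064726),  # C7H8N4O2
]
library = expand(known)

# An observed feature at the [M+H]+ m/z of caffeine
hits = match_mz(195.087652, library)
for name, error_ppm in hits:
    print(f"{name}: {error_ppm:+.2f} ppm")
```

In this toy library the single feature matches both caffeine and the predicted methylation product of theophylline, because the two have the same elemental formula. That ambiguity is why m/z matching alone is not enough, and why the approach described above also relies on predicted MS/MS and NMR spectra to rank candidates.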

Highlights

  • “...there are known unknowns; that is to say we know there are some things we do not know. There are unknown unknowns – the ones we don't know we don't know.”

  • Using larger databases (PubChem, ChemSpider) with m/z matching leads to many false positives

Summary

Computational Tools for the Identification of Unknowns

David Wishart, University of Alberta. 3rd International Electronic Conference on Metabolomics, Nov. 15-30, 2018.

Levels of Metabolite ID for Untargeted Metabolomics
How Well Do We Do?
Why Are We Doing So Badly?
UofA Metabolomics Databases
Phase II Human Gut Microbial
BioTransformer Meteor
BioTransformer Updates
LipidBlast
Findings
Conclusions