Abstract
Interpretation of nuclear magnetic resonance (NMR) experimental results for metabolomics studies requires intensive signal processing and multivariate data analysis techniques. A critical step in the typical workflow is the identification of significant metabolites, usually compiled post hoc. Current techniques rely on manual tuning and are built on databases (DBs) of pure compound samples, where the experimental conditions are simulated in the laboratory. Herein, we develop a novel metabolite identification algorithm that uses a Bayes classifier with genetic algorithm (GA) feature selection built upon empirical spectroscopic data. This approach captures the inherent variability in experimental data while greatly reducing the need to build DBs of pure compounds. The ability to annotate spectra by learning patterns within empirical data allows the metabolomics community to utilize existing datasets to improve and extend our method. The feasibility and accuracy of our algorithm are shown by measuring the specificity (>0.75) and sensitivity (>0.65) on ¹H urine-derived spectroscopic data. The GA removes more than 60% of the features without sacrificing accuracy, thereby reducing redundant data and removing irrelevant data from the empirical dataset. This increase in efficiency is critical to extending and improving community-annotated identification DBs.
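To make the combination of a Bayes classifier with GA feature selection concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes a Gaussian naive Bayes model, a binary "metabolite present / absent" label, a fitness equal to balanced accuracy minus a penalty on the fraction of features kept, and synthetic data standing in for binned spectra. All function names and parameter values are hypothetical.

```python
import numpy as np

def make_synthetic(n=200, d=40, informative=5, seed=0):
    """Synthetic stand-in for binned spectra (assumption, not real NMR data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d))
    X[:, :informative] += 1.5 * y[:, None]  # only the first few features carry signal
    return X, y

def fit_gnb(X, y):
    """Estimate per-class means, variances, and log priors (Gaussian naive Bayes)."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-6
    log_priors = np.log(np.array([(y == c).mean() for c in classes]))
    return classes, means, variances, log_priors

def predict_gnb(model, X):
    """Assign each sample to the class with the highest log posterior."""
    classes, means, variances, log_priors = model
    diff = X[:, None, :] - means[None]
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None] + diff**2 / variances[None]).sum(axis=2)
    return classes[np.argmax(log_lik + log_priors, axis=1)]

def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    return tp / max(tp + fn, 1), tn / max(tn + fp, 1)

def fitness(mask, X_tr, y_tr, X_va, y_va, penalty=0.1):
    """Balanced accuracy on validation data minus a penalty for keeping many features."""
    if not mask.any():
        return -1.0  # an empty feature set cannot classify anything
    model = fit_gnb(X_tr[:, mask], y_tr)
    sens, spec = sensitivity_specificity(y_va, predict_gnb(model, X_va[:, mask]))
    return 0.5 * (sens + spec) - penalty * mask.mean()

def ga_select(X_tr, y_tr, X_va, y_va, pop_size=30, generations=25, p_mut=0.02, seed=0):
    """Evolve boolean feature masks: tournament selection, uniform crossover, bit-flip mutation."""
    rng = np.random.default_rng(seed)
    n = X_tr.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    best_mask, best_fit = pop[0].copy(), -np.inf
    for _ in range(generations):
        fits = np.array([fitness(m, X_tr, y_tr, X_va, y_va) for m in pop])
        if fits.max() > best_fit:
            best_fit, best_mask = fits.max(), pop[fits.argmax()].copy()
        pairs = rng.integers(pop_size, size=(pop_size, 2))  # binary tournaments
        winners = np.where(fits[pairs[:, 0]] >= fits[pairs[:, 1]], pairs[:, 0], pairs[:, 1])
        parents = pop[winners]
        partners = parents[rng.permutation(pop_size)]
        children = np.where(rng.random((pop_size, n)) < 0.5, parents, partners)
        children ^= rng.random((pop_size, n)) < p_mut  # flip a small share of bits
        children[0] = best_mask  # elitism: keep the best mask found so far
        pop = children
    return best_mask

if __name__ == "__main__":
    X, y = make_synthetic()
    mask = ga_select(X[:140], y[:140], X[140:], y[140:])
    model = fit_gnb(X[:140][:, mask], y[:140])
    print("features kept:", int(mask.sum()), "of", mask.size)
    print("sensitivity, specificity:", sensitivity_specificity(y[140:], predict_gnb(model, X[140:][:, mask])))
```

Because this sketch scores the final mask on the same validation split that guided the search, the printed sensitivity and specificity are optimistic; an unbiased estimate would need a separate held-out set.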