Abstract

Processing liquid chromatography-mass spectrometry-based metabolomics data with computational programs often introduces additional quantitative uncertainty, which previous work termed computational variation. This work develops a computational solution that automatically recognizes metabolic features with computational variation in a metabolomics data set. The tool, AVIR (short for "Accurate eValuation of alIgnment and integRation"), is a support vector machine-based machine learning strategy (https://github.com/HuanLab/AVIR). The rationale is that metabolic features with computational variation show a poor correlation between chromatographic peak area-based and peak height-based quantifications across the samples in a study. AVIR was trained on a set of 696 manually curated metabolic features and achieved an accuracy of 94% in 10-fold cross-validation. When tested on various external data sets from public metabolomics repositories, AVIR achieved accuracies ranging from 84% to 97%. Finally, when tested on a large-scale metabolomics study, AVIR clearly indicated features with computational variation and thus guided us to manually correct them. Our results show that 75.3% of the samples with computational variation had a relative intensity difference of over 20% after correction. This demonstrates the critical role of AVIR in reducing computational variation to improve quantitative certainty in untargeted metabolomics analysis.
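
The rationale stated in the abstract, that features affected by computational variation show poor agreement between peak area and peak height across samples, can be illustrated with a minimal sketch. This is not AVIR's implementation: AVIR uses a trained support vector machine, whereas the sketch below only computes a per-feature Pearson correlation and applies a fixed cutoff. The function names, the dictionary input layout, and the `min_r` value of 0.9 are assumptions made for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def flag_suspect_features(areas, heights, min_r=0.9):
    """Flag features whose area/height correlation across samples is low.

    areas, heights: dicts mapping a feature id to a list of per-sample values
    (hypothetical layout). min_r is an illustrative cutoff, not a value from
    the paper.
    """
    flags = {}
    for feature_id, area_values in areas.items():
        r = pearson(area_values, heights[feature_id])
        # An undefined correlation is treated as suspect so it gets reviewed.
        flags[feature_id] = math.isnan(r) or r < min_r
    return flags


# Toy data: feature "f1" has proportional area and height; "f2" does not.
areas = {"f1": [1.0, 2.0, 3.0, 4.0], "f2": [1.0, 2.0, 3.0, 4.0]}
heights = {"f1": [2.0, 4.0, 6.0, 8.0], "f2": [4.0, 1.0, 3.0, 2.0]}
flags = flag_suspect_features(areas, heights)
```

In this toy example, "f1" would pass and "f2" would be flagged for manual inspection. A learned classifier such as the one the abstract describes can weigh more evidence than a single correlation cutoff.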