Abstract

Extracting metabolic features from liquid chromatography-mass spectrometry (LC-MS) data has been a long-standing bioinformatic challenge in untargeted metabolomics. Conventional feature extraction algorithms fail to recognize features with low signal intensities, poor chromatographic peak shapes, or characteristics that do not fit the parameter settings. This problem also poses a challenge for MS-based exposome studies, as low-abundance metabolic or exposomic features cannot be automatically recognized from raw data. To address this data processing challenge, we developed an R package, JPA (short for Joint Metabolomic Data Processing and Annotation), to comprehensively extract metabolic features from raw LC-MS data. JPA performs feature extraction by combining a conventional peak picking algorithm with strategies for (1) recognizing features that have poor peak shapes but do have tandem mass spectra (MS2) and (2) picking up features from a user-defined targeted list. The performance of JPA in global metabolomics was demonstrated using serially diluted urine samples, in which JPA rescued an average of 25% of the metabolic features that were missed by the conventional peak picking algorithm due to dilution. More importantly, the chromatographic peak shapes, analytical accuracy, and precision of the rescued metabolic features were all evaluated. Furthermore, owing to its sensitive feature extraction, JPA achieved a limit of detection (LOD) up to thousands of fold lower when automatically processing metabolomics data of a serially diluted metabolite standard mixture analyzed in HILIC(−) and RP(+) modes. Finally, the performance of JPA in exposome research was validated using a mixture of 250 drugs and 255 pesticides at environmentally relevant levels. JPA detected an average of 2.3-fold more exposure compounds than conventional peak picking alone.

Highlights

  • Liquid chromatography-mass spectrometry (LC-MS) is a high-throughput analytical platform that enables the unbiased detection and quantification of small molecules in biological samples

  • While it is possible to manually recognize metabolic features from the raw LC-MS data, omics-scale metabolic feature extraction has to rely on feature extraction programs, as manual checking is tedious and time-consuming

  • Metabolic feature extraction in JPA is composed of three functions
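Reading the three functions as the three feature sources named in the abstract (conventional peak picking, MS2-based recognition, and a user-defined targeted list), the combination step can be pictured as merging three feature lists while dropping duplicates. The Python sketch below is only an illustration of that idea, not JPA's implementation: JPA is an R package, and the names `Feature`, `is_same_feature`, and `merge_features`, the tolerance values, and the example m/z and retention time values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    mz: float    # mass-to-charge ratio
    rt: float    # retention time in seconds
    source: str  # which step reported the feature


def is_same_feature(a, b, mz_tol=0.01, rt_tol=30.0):
    """Treat two features as duplicates if both m/z and RT agree within tolerance.

    The tolerances are illustrative assumptions, not values from the paper.
    """
    return abs(a.mz - b.mz) <= mz_tol and abs(a.rt - b.rt) <= rt_tol


def merge_features(*feature_lists, mz_tol=0.01, rt_tol=30.0):
    """Merge feature lists in order; earlier lists win when duplicates occur."""
    merged = []
    for features in feature_lists:
        for candidate in features:
            already_kept = any(
                is_same_feature(candidate, kept, mz_tol, rt_tol) for kept in merged
            )
            if not already_kept:
                merged.append(candidate)
    return merged


# Made-up example values for illustration only.
peak_picked = [Feature(180.063, 120.0, "peak_picking")]
ms2_rescued = [Feature(180.064, 122.0, "ms2"), Feature(255.232, 410.0, "ms2")]
targeted = [Feature(255.233, 409.0, "targeted"), Feature(310.120, 95.0, "targeted")]

combined = merge_features(peak_picked, ms2_rescued, targeted)
for feature in combined:
    print(feature)
```

In this sketch the order of the arguments decides which copy of a duplicated feature survives, so features found by conventional peak picking take precedence and the other two lists only contribute features that were otherwise missed.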


Summary

Introduction

Liquid chromatography-mass spectrometry (LC-MS) is a high-throughput analytical platform that enables the unbiased detection and quantification of small molecules in biological samples. Various algorithms, including centWave [10], GridMass [11], and others [12,13], have been proposed to automatically recognize the Gaussian-shaped extracted ion chromatograms (EICs) that represent real metabolic features in LC-MS data. Given their diverse concentrations and chemical properties, many metabolites, especially those at low concentrations, do not present well-defined Gaussian-shaped EICs and cannot be recognized automatically. For these metabolites, conventional peak picking algorithms are not effective. There is great demand for novel bioinformatic solutions that recognize and extract these low-quality metabolic features in order to fully unleash the analytical power of the LC-MS platform.
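To make the dependence on Gaussian peak shape concrete, the sketch below scores an EIC by its Pearson correlation with a Gaussian curve matched to the trace's intensity-weighted mean and variance. It is a deliberately simple stand-in written for illustration, not the centWave, GridMass, or JPA algorithm; the function name `gaussian_similarity` and the two example traces are assumptions.

```python
import math


def gaussian_similarity(intensities):
    """Pearson correlation between an EIC and a Gaussian matched to its moments."""
    n = len(intensities)
    total = sum(intensities)
    if n < 3 or total <= 0:
        return 0.0

    # Intensity-weighted mean and variance of the scan index.
    center = sum(i * y for i, y in enumerate(intensities)) / total
    variance = sum(y * (i - center) ** 2 for i, y in enumerate(intensities)) / total
    if variance <= 0:
        return 0.0

    # Unit-height Gaussian with the same center and width.
    model = [math.exp(-((i - center) ** 2) / (2 * variance)) for i in range(n)]

    mean_y = total / n
    mean_m = sum(model) / n
    cov = sum((y - mean_y) * (m - mean_m) for y, m in zip(intensities, model))
    var_y = sum((y - mean_y) ** 2 for y in intensities)
    var_m = sum((m - mean_m) ** 2 for m in model)
    if var_y == 0 or var_m == 0:
        return 0.0
    return cov / math.sqrt(var_y * var_m)


# A clean, high-intensity peak (made-up values).
clean = [1000 * math.exp(-((i - 10) ** 2) / (2 * 2.0 ** 2)) for i in range(21)]

# The same peak shape at low intensity, sitting on a jagged baseline (made-up values).
baseline = [12, 3, 18, 7, 15, 2, 20, 9, 14, 5, 11, 16, 4, 17, 6, 13, 1, 19, 8, 16, 10]
noisy = [b + 20 * math.exp(-((i - 10) ** 2) / (2 * 2.0 ** 2)) for i, b in enumerate(baseline)]

clean_score = gaussian_similarity(clean)
noisy_score = gaussian_similarity(noisy)
print(f"clean: {clean_score:.3f}, noisy: {noisy_score:.3f}")
```

A picker that accepts only traces above a fixed shape score would keep the first trace and may reject the second, even though both contain the same underlying peak. That is the kind of low-abundance feature the introduction describes as being missed.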

