Abstract

Motivation: Metabolomics is an increasingly common part of health research, and there is a need for preanalytical data processing. Researchers typically need to characterize the data and to exclude errors within the context of the intended analysis. Whilst some preprocessing steps are common, there is currently a lack of standardization and reporting transparency for these procedures.

Results: Here, we introduce metaboprep, a standardized data processing workflow to extract and characterize high-quality metabolomics datasets. The package extracts data from preformed worksheets, provides summary statistics and enables the user to select samples and metabolites for their analysis based on a set of quality metrics. A report summarizing quality metrics and the influence of available batch variables on the data is generated for the purpose of open disclosure. Where possible, we provide users with flexibility in defining their own selection thresholds.

Availability and implementation: metaboprep is an open-source R package available at https://github.com/MRCIEU/metaboprep.

Supplementary information: Supplementary data are available at Bioinformatics online.

Highlights

  • In the last decade, the study of chemical products arising from biological processes has moved from chemometrics to epidemiology (Ala-Korpela, 2015)

  • This paper introduces metaboprep, an R package developed to help those working with curated metabolomics data to achieve transparent and informed processing of their study sample data prior to statistical analysis

  • We demonstrate the use of metaboprep using the Born in Bradford (BiB) cohort, including 1,000 pregnant women with UPLC-mass spectrometry (MS)/MS data (Metabolon), and the Avon Longitudinal Study of Parents and Children (ALSPAC), a birth cohort with 3,361 samples collected during early adulthood and analysed by nuclear magnetic resonance (NMR) (Nightingale Health)



Introduction

The study of chemical products arising from biological processes has moved from chemometrics to epidemiology (Ala-Korpela, 2015). With rapid advances in technology and bioinformatics enabling the quantification of hundreds or even thousands of metabolites from a single biological sample, there is potential for these measurements to reveal valuable insights into biology and health. Both mass spectrometry (MS) and nuclear magnetic resonance (NMR) are common technologies used in these untargeted studies. Laboratories have their own established protocols for sample preparation, generation of standards and controls, and corrections for instrument and run-day variability. Researchers are able to access high-quality curated metabolomics data at scale.
