Abstract

Thousands of articles using metabolomics approaches are published every year. With the increasing amounts of data being produced, mere description of investigations as text in manuscripts is no longer sufficient to enable re-use: the underlying data needs to be published together with the findings in the literature to maximise the benefit from public and private expenditure and to take advantage of an enormous opportunity to improve scientific reproducibility in metabolomics and cognate disciplines. Reporting recommendations in metabolomics started to emerge about a decade ago and were mostly concerned with inventories of the information that had to be reported in the literature for consistency. In recent years, metabolomics data standards have developed extensively to include the primary research data, derived results, the experimental description and, importantly, the metadata in a machine-readable way. This includes vendor-independent data standards such as mzML for mass spectrometry and nmrML for NMR raw data, both of which have enabled the development of advanced data processing algorithms by the scientific community. Standards such as ISA-Tab cover essential metadata, including the experimental design, the applied protocols, and the association between samples, data files and the experimental factors for further statistical analysis. Altogether, they pave the way for both reproducible research and data reuse, including meta-analyses. Further incentives to prepare standards-compliant data sets include new opportunities to publish data sets, but adoption also requires a little “arm twisting” in the author guidelines of scientific journals so that data sets are submitted to public repositories such as the NIH Metabolomics Workbench or MetaboLights at EMBL-EBI. In the present article, we look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption.

Highlights

  • Data standardisation efforts can trigger ambivalent and often polarised reactions

  • We look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption

  • With biological assays increasingly represented in digital form, biology has become a data-intensive field of disparate methods, with images, sequence reads and spectra, to name only a few, all being acquired in droves

Summary

Standards such as ISA-Tab cover essential metadata, including the experimental design, the applied protocols, and the association between samples, data files and the experimental factors for further statistical analysis. They pave the way for both reproducible research and data reuse, including meta-analyses. Further incentives to prepare standards-compliant data sets include new opportunities to publish data sets, but adoption also requires a little “arm twisting” in the author guidelines of scientific journals so that data sets are submitted to public repositories such as the NIH Metabolomics Workbench or MetaboLights at EMBL-EBI. We look at standards for data sharing, investigate their impact in metabolomics and give suggestions to improve their adoption.
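
To make the idea of machine-readable metadata concrete, the sketch below shows how a tab-delimited sample table can link samples, data files and an experimental factor so that files can be grouped for statistical analysis. It is only an illustration under stated assumptions: the table content, the file names and the helper `files_by_factor` are hypothetical, and the column headers are a simplified imitation of ISA-Tab-style headers rather than a complete or validated ISA-Tab document.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, simplified sample table (assumption: illustrative column
# names and file names only; this is not a full ISA-Tab assay file).
TABLE = (
    "Sample Name\tFactor Value[Treatment]\tRaw Spectral Data File\n"
    "sample_1\tcontrol\tsample_1.mzML\n"
    "sample_2\tcontrol\tsample_2.mzML\n"
    "sample_3\ttreated\tsample_3.mzML\n"
)


def files_by_factor(table_text, factor_column, file_column):
    """Group data file names by the value of one experimental factor."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row[factor_column]].append(row[file_column])
    return dict(groups)


groups = files_by_factor(
    TABLE, "Factor Value[Treatment]", "Raw Spectral Data File"
)
for level, files in groups.items():
    print(level, files)
```

Because the association between files and factor levels lives in the table rather than in free text, a downstream analysis can recover the study design without consulting the manuscript.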

Introduction
Mass spectrometry raw data standards
NMR raw data standards
Study design and experimental metadata standards
Formats for standardised metadata capture
How to weave data standards into life-science experiments
Conclusion
Compliance with ethical standards
Findings