Abstract

In the era of big data and omics, good organization, management, and description of experimental data are crucial for achieving high-quality datasets. This, in turn, is essential for producing robust results, publishing reliable papers, making data more easily available, and unlocking the huge potential of data reuse. More and more journals now require authors to share data and metadata according to the FAIR (Findable, Accessible, Interoperable, Reusable) principles. This work aims to provide a step-by-step guideline for FAIR data and metadata management specific to grapevine and wine science. In detail, the guidelines include recommendations for the organization of data and metadata regarding (i) meaningful information on experimental design and phenotyping, (ii) sample collection, (iii) sample preparation, (iv) chemotype analysis, (v) data analysis, (vi) metabolite annotation, and (vii) basic ontologies. We hope that these guidelines will be helpful for the grapevine and wine metabolomics community, and that the community will benefit from the true potential of data reuse in creating new knowledge.

Highlights

  • Please note that the study ID can be used to retrieve the dataset in MetaboLights, but because it is mentioned only in the Materials and Methods section, it is not sufficient for the dataset to be indexed in the Data Citation Index (DCI) of Clarivate Web of Science. Such indexing would be desirable in order to connect any published paper to the FAIR data present in the public repository.

Introduction

Thanks to the increasing availability of thousands of sequenced plant genomes [1], and the parallel use of high-throughput next-generation sequencing techniques, such as the popular RNA sequencing [2], profiling the entire plant transcriptome and studying gene expression during development and/or in response to biotic and environmental conditions [3] has become quite straightforward. In the past few years, the advancement of methodologies based on liquid and gas chromatography (LC and GC) coupled with mass spectrometry (MS), and on nuclear magnetic resonance (NMR) spectroscopy, opened a new field of research, metabolomics [4,5,6], which makes it possible to perform large-scale measurements of hundreds or even thousands of metabolites in one run with targeted or untargeted approaches [7]. Different instrumental analysis and data analysis protocols will deliver complementary (but not conflicting) datasets and, possibly, slightly different conclusions. This higher complexity requires highly organized data and metadata management, and metabolomic data must be combined with a detailed set of metadata to be correctly read and reused outside the original experiment.
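As a rough illustration of what "a detailed set of metadata" could mean in practice, the sketch below checks whether a metadata record covers the seven categories listed in the abstract. It is only a minimal sketch: the section names, the dictionary layout, and the helper `missing_sections` are assumptions made for this example and are not the schema proposed by the guidelines or used by any repository.

```python
# Hypothetical section names, mirroring items (i)-(vii) of the abstract.
REQUIRED_SECTIONS = (
    "experimental_design_and_phenotyping",
    "sample_collection",
    "sample_preparation",
    "chemotype_analysis",
    "data_analysis",
    "metabolite_annotation",
    "ontologies",
)


def missing_sections(record):
    """Return the required sections that are absent or empty in `record`."""
    return [name for name in REQUIRED_SECTIONS if not record.get(name)]


# Placeholder record: every value is illustrative, not taken from a real study.
example_record = {
    "experimental_design_and_phenotyping": "placeholder: design and phenotyping notes",
    "sample_collection": "placeholder: how and when samples were collected",
    "sample_preparation": "placeholder: extraction protocol",
    "chemotype_analysis": "placeholder: instrument and acquisition method",
    "data_analysis": "placeholder: processing software and parameters",
    "metabolite_annotation": "",  # left empty on purpose
    # "ontologies" is omitted on purpose
}

print(missing_sections(example_record))
```

A check of this kind could be run before a dataset is submitted, so that gaps in the metadata are caught while the people who ran the experiment can still fill them in.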
