Abstract

Data papers are gaining popularity as a publishing format that allows easy and quick publishing of research data (Chavan and Penev 2011, Penev et al. 2017). They describe one or more datasets and the methodologies used to generate them. Like traditional research articles, data papers and their underlying datasets are peer-reviewed. In this poster, we demonstrate how data papers can be used to incentivise researchers producing omics datasets to increase the quality of the metadata descriptors and the data itself through the journal authoring, peer review and publication process, thus improving data visibility, discoverability, sharing and reuse. We illustrate a highly automated workflow for the creation of omics data paper manuscripts, which started with the development of a template for this specific article type in the Biodiversity Data Journal (BDJ), published by Pensoft (Dimitrova et al. 2020). The workflow streamlines automatic conversion and import of metadata from the European Nucleotide Archive (ENA) into an omics data paper manuscript created in the ARPHA Writing Tool (AWT), following a three-step procedure: mapping the ENA metadata to the manuscript sections, extracting the relevant metadata through the ENA project or study ID, and transforming the metadata into HTML or XML files. The XML file follows the Journal Article Tag Suite (JATS) standard and can be used by anyone as a draft to further develop a data paper manuscript and submit it to any journal.
Records in ENA sometimes have linked data in the ArrayExpress and BioSamples databases, which describe sequencing experiments and samples following the community-accepted metadata standards MINSEQE and MIxS. The workflow also retrieves such records and inserts them both into the omics data paper narrative and as supplementary data files. The workflow has been integrated with Pensoft's ARPHA platform, but the conversion code is openly accessible on GitHub under the Apache 2.0 license and can be run as an R Shiny app. By openly providing access to the code and its implementation in a web application, we enable the full reproducibility of the streamlined import of ENA metadata into an omics data paper manuscript. We plan to develop the workflow further so that it can import other types of omics data and draw on other omics data repositories, in addition to the currently supported ENA genomic data. The workflow reaffirms the important role of high-quality metadata for creating extended dataset descriptions, as recognised by Chavan and Penev (2011). Converting metadata into a manuscript helped us discover many datasets with insufficient or inaccurate metadata. Hence, we hope that our workflow promotes not only omics data paper publishing but also better metadata authoring and curation.
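
The mapping and transformation steps described above can be illustrated with a minimal Python sketch. It is not the authors' conversion code, which is written in R and available on GitHub. The field names, section titles and example record below are hypothetical placeholders rather than the ENA schema or the BDJ template, the record is hard-coded instead of being retrieved from ENA, and the output is only a JATS-style skeleton that has not been validated against the JATS DTD. The helper `missing_fields` shows how the same mapping could flag records with incomplete metadata.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata fields that fill the front matter of the manuscript.
FRONT_FIELDS = ["study_title", "study_description"]

# Hypothetical mapping from metadata fields to body section titles.
# These names are illustrative only; they are not the ENA schema or the BDJ template.
BODY_SECTIONS = {
    "sampling_method": "Sampling methods",
    "sequencing_platform": "Sequencing platform",
}

# Placeholder record standing in for metadata extracted by project or study ID.
EXAMPLE_RECORD = {
    "study_title": "Example metagenome study",
    "study_description": "Placeholder description of the study.",
    "sampling_method": "Placeholder description of sampling.",
    # "sequencing_platform" is deliberately absent to show the completeness check.
}


def missing_fields(metadata: dict) -> list:
    """Return the mapped fields that are absent or empty in the record."""
    required = FRONT_FIELDS + list(BODY_SECTIONS)
    return [field for field in required if not metadata.get(field)]


def metadata_to_jats(metadata: dict) -> ET.Element:
    """Build a JATS-style <article> skeleton from a metadata record."""
    article = ET.Element("article")
    front = ET.SubElement(article, "front")
    article_meta = ET.SubElement(front, "article-meta")

    # The title and abstract go into the front matter.
    title_group = ET.SubElement(article_meta, "title-group")
    ET.SubElement(title_group, "article-title").text = metadata.get("study_title", "")
    abstract = ET.SubElement(article_meta, "abstract")
    ET.SubElement(abstract, "p").text = metadata.get("study_description", "")

    # Each mapped field that has a value becomes one <sec> in the body.
    body = ET.SubElement(article, "body")
    for field, section_title in BODY_SECTIONS.items():
        value = metadata.get(field)
        if not value:
            continue  # skip sections with no metadata instead of emitting empty ones
        sec = ET.SubElement(body, "sec")
        ET.SubElement(sec, "title").text = section_title
        ET.SubElement(sec, "p").text = value
    return article


if __name__ == "__main__":
    draft = metadata_to_jats(EXAMPLE_RECORD)
    print(ET.tostring(draft, encoding="unicode"))
    print("Fields needing curation:", missing_fields(EXAMPLE_RECORD))
```

Keeping the field-to-section mapping in a plain dictionary separates the mapping step from the transformation step, so the mapping could be extended to other record types without touching the XML-building code.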

Corresponding author: Mariya Dimitrova (m.dimitrova@pensoft.net)
Received: 28 Sep 2020 | Published: 28 Sep 2020
Citation: Dimitrova M, Meyer R, Buttigieg PL, Georgiev T, Zhelezov G, Demirov S, Smith VS, Penev L (2020) Streamlined Conversion of Omics Metadata into Manuscript Facilitates Publishing and Reuse of Omics Data.
