Streamlined Conversion of Omics Metadata into Manuscript Facilitates Publishing and Reuse of Omics Data

Mariya Dimitrova, Seyhan Demirov, Pier Luigi Buttigieg, Lyubomir Penev, Vincent Smith, Georgi Zhelezov, Raïssa Meyer, Teodor Georgiev

doi:10.3897/biss.4.59041

Abstract

Data papers have started to gain popularity as a publishing format that allows easy and quick publishing of research data (Chavan and Penev 2011, Penev et al. 2017). They describe single or multiple datasets and the methodologies required for their generation. Similar to traditional research articles, data papers and the underlying datasets are peer-reviewed. In this poster, we demonstrate how data papers can be used to incentivise researchers producing omics datasets to increase the quality of the metadata descriptors and the data itself through the journal authoring, peer review and publication process, thus improving data visibility, discoverability, sharing and reuse. We illustrate a highly automated workflow for the creation of omics data paper manuscripts, which started with the development of a template for this specific article type in the Biodiversity Data Journal (BDJ), published by Pensoft (Dimitrova et al. 2020). The workflow streamlines automatic conversion and import of metadata from the European Nucleotide Archive (ENA) into an omics data paper manuscript created in the ARPHA Writing Tool (AWT), following a three-step procedure: mapping of the ENA metadata to the manuscript sections, extraction of the relevant metadata through the ENA project or study ID, and transformation of the metadata into HTML or XML files. The XML file follows the Journal Article Tag Suite (JATS) standard and can be used by anyone as a draft to further develop a data paper manuscript and submit it to any journal.
Records in ENA sometimes have linked data in the ArrayExpress and BioSamples databases, which describe sequencing experiments and samples following the community-accepted metadata standards MINSEQE and MIxS. The workflow also retrieves such records and inserts them both into the omics data paper narrative and as supplementary data files. The workflow has been integrated with Pensoft's ARPHA platform, but the conversion code is openly accessible on GitHub under the Apache 2.0 license and can be run as an R Shiny app. By openly providing access to the code and its implementation in a web application, we enable the full reproducibility of the streamlined import of ENA metadata into an omics data paper manuscript. We plan to develop the workflow further so that it imports other types of omics data and supports other omics data repositories, in addition to the currently supported ENA genomic data. The workflow reaffirms the important role of high-quality metadata for creating extended dataset descriptions, as recognised by Chavan and Penev (2011). Conversion of metadata into a manuscript helped us discover many datasets with insufficient or inaccurate metadata. Hence, we hope that our workflow promotes not only omics data paper publishing but also better metadata authoring and curation.
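The three-step procedure described in the abstract (mapping, extraction, transformation) can be illustrated with a minimal Python sketch. This is not the authors' code, which is written in R and available on GitHub. The field names in `FIELD_TO_SECTION`, the helper names `extract_metadata` and `to_jats_like_xml`, and the example record are all hypothetical placeholders. The sketch also assumes the metadata record has already been retrieved by project or study ID, so it makes no calls to ENA, and it emits only a JATS-like skeleton rather than a validated JATS document.

```python
import xml.etree.ElementTree as ET

# Step 1 (mapping): hypothetical metadata field names mapped to manuscript
# sections. The real workflow defines its own mapping; these are placeholders.
FIELD_TO_SECTION = {
    "study_title": "title",
    "study_description": "abstract",
    "sampling_description": "methods",
}


def extract_metadata(record):
    """Step 2 (extraction, stand-in): keep only mapped, non-empty text fields.

    `record` is assumed to be a dict already fetched for one project or
    study ID; fetching it from a repository is outside this sketch.
    """
    return {
        field: value.strip()
        for field, value in record.items()
        if field in FIELD_TO_SECTION and isinstance(value, str) and value.strip()
    }


def to_jats_like_xml(metadata):
    """Step 3 (transformation): build a JATS-like XML draft from the metadata."""
    article = ET.Element("article")
    front = ET.SubElement(article, "front")
    article_meta = ET.SubElement(front, "article-meta")
    body = ET.SubElement(article, "body")

    for field, text in metadata.items():
        section = FIELD_TO_SECTION[field]
        if section == "title":
            group = ET.SubElement(article_meta, "title-group")
            ET.SubElement(group, "article-title").text = text
        elif section == "abstract":
            abstract = ET.SubElement(article_meta, "abstract")
            ET.SubElement(abstract, "p").text = text
        else:
            # Any other mapped section becomes a titled <sec> in the body.
            sec = ET.SubElement(body, "sec", {"sec-type": section})
            ET.SubElement(sec, "title").text = section.capitalize()
            ET.SubElement(sec, "p").text = text

    return ET.tostring(article, encoding="unicode")


# Hypothetical record: one unmapped field and one empty field are dropped.
example_record = {
    "study_title": "Example study",
    "study_description": "Amplicon sequencing of soil samples.",
    "sampling_description": "  Samples were collected at three sites. ",
    "internal_id": "not-mapped",
    "library_strategy": "",
}

draft_xml = to_jats_like_xml(extract_metadata(example_record))
print(draft_xml)
```

Keeping the mapping in a single dictionary means that supporting another metadata field only requires one new entry, while empty or unmapped fields are silently skipped. Such gaps are also where insufficient metadata in a source record would become visible in the draft.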

Corresponding author: Mariya Dimitrova (m.dimitrova@pensoft.net)
Received: 28 Sep 2020 | Published: 28 Sep 2020
Citation: Dimitrova M, Meyer R, Buttigieg PL, Georgiev T, Zhelezov G, Demirov S, Smith VS, Penev L (2020) Streamlined Conversion of Omics Metadata into Manuscript Facilitates Publishing and Reuse of Omics Data.

Journal: Biodiversity Information Science and Standards
Publication Date: Sep 28, 2020
License type: CC BY 4.0
