From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics.

Alejandra González-Beltrán, Rajaram Kaliyaperumal, Philippe Rocca-Serra, Maria Susana Avila-Garcia, Jun Zhao, Scott C Edmunds, Tak-Wah Lam, Peter Li, Mark Thompson, Marco Roos, Susanna-Assunta Sansone, Ruibang Luo, Tin-Lap Lee, Eelke Van Der Horst

doi:10.1371/journal.pone.0127612

Abstract

Motivation: Reproducing the results from a scientific paper can be challenging due to the absence of data and the computational tools required for their analysis. In addition, details relating to the procedures used to obtain the published results can be difficult to discern due to the use of natural language when reporting how experiments have been performed. The Investigation/Study/Assay (ISA), Nanopublications (NP), and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent to which ISA, NP, and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

Results: Executable workflows were developed using Galaxy, which reproduced results that were consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the use of ISA, NP, and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables, and findings. The models served as guides in the curation of scientific information and this led to the identification of inconsistencies in the original published paper, thereby allowing its authors to publish corrections in the form of an erratum.

Availability: SOAPdenovo2 scripts, data, and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP, and RO models are available through the SOAPdenovo2 case study website http://isa-tools.github.io/soapdenovo2/.

Contact: philippe.rocca-serra@oerc.ox.ac.uk and susanna-assunta.sansone@oerc.ox.ac.uk.

Highlights

Several reports have highlighted the practical difficulties in reproducing results from published experiments [1,2,3,4]
Scientists are coming under increasing pressure from funding agencies to disseminate their research data and methods
Research outputs may be assigned a Digital Object Identifier (DOI), a process overseen by DataCite [80], possibly facilitating discovery and citation

Summary

Introduction

Several reports have highlighted the practical difficulties in reproducing results from published experiments [1,2,3,4]. Amongst the incentives tried by publishers are the lifting of restrictions on the length of methods sections; the creation of data publication platforms, such as GigaScience [6] and Scientific Data [7]; the provision of a statistical review of numerical results where appropriate; and the requirement for data to be deposited in open-access repositories. These efforts have in part been driven by position statements from funding agencies, publishers and researchers advocating more widespread data sharing [8,9,10]. The NIH program Big Data to Knowledge (BD2K) constitutes a major initiative aimed at making data dissemination and data preservation a reality for all NIH-funded work, by mandating the creation of data access plans for all new grant applications [13].

Journal: PloS one
Publication Date: Jul 8, 2015
Citations: 30
License type: CC BY 4.0
