CODECHECK: An open-science initiative to facilitate sharing of computer programs and results presented in scientific publications

Stephen Eglen, Daniel Nüst

doi:10.7557/5.4910

Abstract

Analysis of data and computational modelling is central to most scientific disciplines. The underlying computer programs are complex and costly to design. However, these computational techniques are rarely checked during review of the corresponding papers, nor shared upon publication. Instead, the primary method for sharing data and computer programs today is for authors to state "data available upon reasonable request", although the actual code and data are the only sufficiently detailed description of a computational workflow that allows reproduction and reuse. Despite best intentions, these programs and data can quickly disappear from laboratories. Furthermore, there is a reluctance to share: only 8% of papers in recent top-tier AI conferences shared code relating to their publications (Gundersen et al. 2018). This low rate of code sharing is seen in other fields, e.g. computational physics (Stodden et al. 2018). Given that code and data are rich digital artefacts that can be shared relatively easily, and that funders and journal publishers increasingly mandate sharing of resources, we should be sharing more and following best practices for data and software publication. The permanent archival of valuable code and datasets would allow other researchers to make use of these resources in their work, and improve the reliability of reporting as well as the quality of tools. We are building a computational platform, called CODECHECK (http://www.codecheck.org.uk), to enhance the availability, discovery and reproducibility of published computational research. Researchers who provide code and data will have their code independently run to ensure the computational parts of a workflow can be reproduced. The results from our independent run will then be shared freely post-publication in an open repository. The reproduction is attributed to the person performing the check. Our independent runs will act as a "certificate of reproducible computation".
These certificates will be of use to several parties at different times during the generation of a scientific publication. Prior to peer review, the researchers themselves can check that their code runs on a separate platform. During peer review, editors and reviewers can check if the figures in the certificate match those presented in manuscripts for review without cumbersome download and installation procedures. Once published, any interested reader can download the software and even the data that were used to generate the results shown in the certificate. The code and results from papers are shared according to the principles we recently outlined (Eglen et al. 2017). To ensure our system scales to large numbers of papers and is trustworthy, it will be as automated as possible, fully open itself, and rely on open source software and open scholarly infrastructure. This presentation will discuss the challenges faced to date in building the system and in connecting it with existing peer-review principles, and plans for links with open access journals.

Acknowledgements

This work has been funded by the UK Software Sustainability Institute, a Mozilla Open Science Mini grant and the German Research Foundation (DFG) under project number PE 1632/17-1.
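The check described in the abstract (an independent run whose outputs are compared with those the authors supplied, then recorded and attributed to the checker) can be illustrated with a minimal Python sketch. This is not the CODECHECK implementation: the function names, the directory layout and the certificate fields are all hypothetical, and byte-for-byte hash comparison is an assumption that is stricter than the visual comparison of figures a human codechecker might make.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def compare_outputs(author_dir: Path, check_dir: Path) -> dict:
    """Compare every output file supplied by the authors with the rerun.

    Each file is labelled "match", "differs", or "missing" (the
    independent run did not produce it).
    """
    report = {}
    for author_file in sorted(p for p in author_dir.rglob("*") if p.is_file()):
        rel = author_file.relative_to(author_dir)
        check_file = check_dir / rel
        if not check_file.is_file():
            status = "missing"
        elif sha256_of(author_file) == sha256_of(check_file):
            status = "match"
        else:
            status = "differs"
        report[rel.as_posix()] = status
    return report


def make_certificate(paper_doi: str, checker: str, report: dict) -> str:
    """Bundle the comparison into a JSON record attributed to the checker."""
    record = {
        # Hypothetical field names; not an official CODECHECK schema.
        "paper_doi": paper_doi,
        "checked_by": checker,
        "files": report,
        # An empty report does not count as a successful reproduction.
        "all_match": bool(report) and all(s == "match" for s in report.values()),
    }
    return json.dumps(record, indent=2, sort_keys=True)
```

A checker would call `compare_outputs` on the authors' output directory and the directory produced by the independent run, then pass the result to `make_certificate`. In practice many outputs (for example rendered figures) will not be byte-identical across platforms, so a "differs" status would call for human judgement rather than automatic rejection.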

Highlights

"The problem is that most modern science is so complicated, and most journal articles so brief, it’s impossible for the article to include details of many important methods and decisions made by the researcher." (Marwick 2015)
Certificates and a snapshot of the data, code and outputs are deposited on Zenodo by the codechecker.
Next steps:
1. How to wrap up the metadata of certificates and artifacts such that they are useful and reusable.
2. Embedding into journal workflows.
3. Training a community of codecheckers.
4. Generating a portfolio of examples.

For more information, please see http://codecheck.org.uk

Summary

Why share code?

"An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship." (Buckheit & Donoho 1995)

"The problem is that most modern science is so complicated, and most journal articles so brief, it’s impossible for the article to include details of many important methods and decisions made by the researcher." (Marwick 2015)

The CODECHECK philosophy

Who benefits?

Limitations

Next steps


Journal: Septentrio Conference Series
Publication Date: Sep 20, 2019
Citations: 6
License type: CC BY 4.0
