Using ReproZip for Reproducibility and Library Services

Vicky Steeves, Rémi Rampin, Fernando Chirigati

doi:10.29173/iq18

Abstract

Achieving research reproducibility is challenging in many ways: there are social and cultural obstacles as well as a constantly changing technical landscape that makes replicating and reproducing research difficult. Users face challenges in reproducing research across different operating systems, in using different versions of software across long projects and among collaborations, and in using publicly available work. The dependencies required to reproduce the computational environments in which research happens can be exceptionally hard to track – in many cases, these dependencies are hidden or nested too deeply to discover, and thus impossible to install on a new machine, which means adoption remains low. In this paper, we present ReproZip, an open source tool to help overcome the technical difficulties involved in preserving and replicating research, applications, databases, software, and more. We will examine the current use cases of ReproZip, ranging from digital humanities to machine learning. We also explore potential library use cases for ReproZip, particularly in digital libraries and archives, liaison librarianship, and other library services. We believe that libraries and archives can leverage ReproZip to deliver more robust reproducibility and repository services, as well as enhanced discoverability and preservation of research materials, applications, software, and computational environments.

Highlights

Reproducibility is at the core of the research process: it is essential for verification and authentication of results, and for driving a field forward.
Despite the widespread attention drawn to the subject following the Reproducibility Project: Psychology, carried out by the Center for Open Science (Open Science Collaboration, 2015), reproducibility remains an elusive target for many researchers (Goodman, Fanelli, and Ioannidis 2016).
Gronenschild et al. (2012) discussed how the results of data analyses in neuroscience performed with the same application differed based on the operating system: "We investigated the effects of data processing variables such as FreeSurfer version (v4.3.1, v4.5.0, and v5.0.0), workstation (Macintosh and Hewlett-Packard), and Macintosh operating system version (OS X 10.5 and OS X 10.6)."

Summary

Introduction

Reproducibility is at the core of the research process: it is essential for verification and authentication of results, and for driving a field forward. Each piece of software or tool may have many unforeseen dependencies, and versions that differ from the original configuration may give totally disparate results or not run at all. To manually address these problems, collectively known as 'dependency hell,' researchers enter into an error-prone and resource-heavy process. They would have to create a file that encapsulates metadata about their computational environment, including the operating system, hardware architecture, and software library dependencies. ReproZip packages are highly portable: ReproZip automatically creates a virtual machine for the user, with no extra work required beyond one click or command, allowing research to be reproduced across different operating systems. While ReproZip has primarily been used in research, in this paper we explore the many ways in which librarians can use ReproZip, from helping user populations create well-managed, reproducible research, to preserving computational environments, to building library infrastructure.
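To make the manual process described above concrete, the following is a minimal sketch of what hand-recording environment metadata might look like for a Python-based project. It is not part of ReproZip and does not reflect how ReproZip works; the function name `describe_environment` and the output field names are hypothetical, chosen only for illustration. It uses the Python standard library alone.

```python
import json
import platform
from importlib import metadata


def describe_environment():
    """Hand-collect basic metadata about the current computational environment.

    Hypothetical helper for illustration only: it records the operating
    system, hardware architecture, and installed Python packages.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with missing or broken metadata
            packages[name] = dist.version

    return {
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "architecture": platform.machine(),
        "python_version": platform.python_version(),
        "python_packages": packages,
    }


if __name__ == "__main__":
    # Write the description as JSON so it can be stored next to the research files.
    print(json.dumps(describe_environment(), indent=2, sort_keys=True))
```

Even this small sketch shows the limits of the manual approach: it sees only packages installed for the running Python interpreter, and misses system libraries, data files, and any other dependencies that are hidden or deeply nested.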

Technical Infrastructure

Packing

Unpacking

Current Use Cases

ReproZip in Librarianship

Digital Libraries

Repository Management

Academic Libraries

Future Development Work

ReproZip-Jupyter

Workflow Visualizations and Graphs

Conclusion

Journal: IASSIST Quarterly
Publication Date: Dec 12, 2017
Citations: 6
License type: cc-by
