Toward collaborative open data science in metabolomics using Jupyter Notebooks and cloud computing

Kevin M Mendez, David I Broadhurst, Leighton Pritchard, Stacey N Reinke

doi:10.1007/s11306-019-1588-0


Open Access


Journal: Metabolomics
Publication Date: Sep 14, 2019
Citations: 59
License type: open-access

Affiliation: Edith Cowan University, University of Strathclyde

Abstract

Background: A lack of transparency and reporting standards in the scientific community has led to increasing and widespread concerns relating to reproduction and integrity of results. As an omics science, which generates vast amounts of data and relies heavily on data science for deriving biological meaning, metabolomics is highly vulnerable to irreproducibility. The metabolomics community has made substantial efforts to align with FAIR data standards by promoting open data formats, data repositories, online spectral libraries, and metabolite databases. Open data analysis platforms also exist; however, they tend to be inflexible and rely on the user to adequately report their methods and results. To enable FAIR data science in metabolomics, methods and results need to be transparently disseminated in a manner that is rapid, reusable, and fully integrated with the published work. To ensure broad use within the community such a framework also needs to be inclusive and intuitive for both computational novices and experts alike.

Aim of Review: To encourage metabolomics researchers from all backgrounds to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science.

Key Scientific Concepts of Review: This tutorial introduces the concept of interactive web-based computational laboratory notebooks. The reader is guided through a set of experiential tutorials specifically targeted at metabolomics researchers, based around the Jupyter Notebook web application, GitHub data repository, and Binder cloud computing platform.

Highlights

Journal articles have been the primary medium for sharing new scientific research
We provide a brief overview of current data science frameworks relevant to the metabolomics community, corresponding barriers to achieving open science, and a practical solution in the form of the computational lab notebook, where code, prose and figures are combined into an interactive notebook that can be published online and accessed in a modern web browser through cloud computing
The remainder of this review provides readers with an experiential learning opportunity (Kolb 1984) using an example interactive metabolomics data analysis workflow deployed using a combination of Python, Jupyter Notebooks, and Binder

Summary

Introduction

Journal articles have been the primary medium for sharing new scientific research. To fully embrace the concept of ‘open data science’, the metabolomics community needs an open and accessible computational environment for rapid collaboration and experimentation. The subject of this tutorial review is a practical open-science solution to this problem that balances ease of use and flexibility, targeted at novice metabolomic data scientists. This solution takes the form of ‘computational lab books’, such as Jupyter Notebooks (Kluyver et al. 2016), that have a diverse range of overlapping potential applications in the post-genomic research community (Fig. 1). The overarching aim of this document is to encourage metabolomics researchers from all backgrounds, possibly with little or no computational expertise, to seize the opportunity to take control of their own data science, mould it to their personal requirements, and enthusiastically share resources through open science.

Software tools and barriers to open science

Collaboration through cloud computing

Experiential learning tutorials

Jupyter Notebook

GitHub

Binder

Tutorial 1: launching and using a Jupyter Notebook on Binder

Tutorial 2: interacting with and editing a Jupyter

Tutorial 3: downloading and installing a Jupyter

Tutorial 4: creating a new Jupyter Notebook on a local computer

Tutorial 5: deploying a Jupyter Notebook on Binder via GitHub

Summary

Findings

Compliance with ethical standards
