Reproducible acquisition, management and meta-analysis of nucleotide sequence (meta)data using q2-fondue.

Michal Ziemski, Nicholas A Bokulich, Lena Flörl, Anja Adamov, Lina Kim

doi:10.1093/bioinformatics/btac639

Abstract

The volume of public nucleotide sequence data has blossomed over the past two decades and is ripe for re- and meta-analyses to enable novel discoveries. However, reproducible re-use and management of sequence datasets and associated metadata remain critical challenges. We created the open source Python package q2-fondue to enable user-friendly acquisition, re-use and management of public sequence (meta)data while adhering to open data principles. q2-fondue allows fully provenance-tracked programmatic access to and management of data from the NCBI Sequence Read Archive (SRA). Unlike other packages allowing download of sequence data from the SRA, q2-fondue enables full data provenance tracking from data download to final visualization, integrates with the QIIME 2 ecosystem, prevents data loss upon space exhaustion and allows download of (meta)data given a publication library. To highlight its manifold capabilities, we present executable demonstrations using publicly available amplicon, whole genome and metagenome datasets. q2-fondue is available as an open-source BSD-3-licensed Python package at https://github.com/bokulich-lab/q2-fondue. Usage tutorials are available in the same repository. All Jupyter notebooks used in this article are available under https://github.com/bokulich-lab/q2-fondue-examples. Supplementary data are available at Bioinformatics online.
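
The programmatic access described above starts from a list of SRA accession IDs. The following is a minimal sketch, not the package's own code, of how such a list might be prepared and how a download command could be assembled in Python. It rests on several assumptions: that the ID file is a one-column TSV with an `id` header (the QIIME 2 metadata convention), that the `qiime fondue get-all` action and the flag names shown match the usage tutorials in the repository (check them against `qiime fondue get-all --help`), and that a simplified accession pattern covering only run and BioProject IDs is sufficient. The helper names `write_accession_file` and `build_get_all_command` are hypothetical, and the accession IDs are placeholders.

```python
import csv
import re
import shlex
import tempfile
from pathlib import Path

# Simplified, assumed pattern: run IDs (SRR/ERR/DRR) and BioProject IDs
# (PRJNA/PRJEB/PRJDB). Real SRA accessions cover more ID types than this.
ACCESSION_RE = re.compile(r"(?:[SED]RR\d+|PRJ[NED][A-Z]\d+)")


def write_accession_file(accessions, path):
    """Write unique accession IDs to a one-column TSV with an 'id' header.

    Returns the de-duplicated IDs in their original order.
    """
    # Strip whitespace and drop duplicates while preserving order.
    unique = list(dict.fromkeys(a.strip() for a in accessions))
    invalid = [a for a in unique if not ACCESSION_RE.fullmatch(a)]
    if invalid:
        raise ValueError(f"Unrecognized accession IDs: {invalid}")
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["id"])
        writer.writerows([a] for a in unique)
    return unique


def build_get_all_command(ids_file, email, output_dir):
    """Assemble (but do not run) a q2-fondue download command.

    Flag names are assumptions based on the project's usage tutorials.
    """
    return [
        "qiime", "fondue", "get-all",
        "--m-accession-ids-file", str(ids_file),
        "--p-email", email,
        "--output-dir", str(output_dir),
    ]


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    ids_path = workdir / "accession_ids.tsv"
    # Placeholder IDs for illustration only.
    write_accession_file(["SRR000001", "PRJNA000001"], ids_path)
    command = build_get_all_command(ids_path, "user@example.org", workdir / "fondue-output")
    # Print the command for inspection instead of executing it.
    print(shlex.join(command))
```

The command is only printed, so the sketch runs without QIIME 2 installed. Passing the list to `subprocess.run` in an environment where q2-fondue is available would perform the actual download.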

Journal: Bioinformatics
Publication Date: Sep 20, 2022
Citations: 6
License type: CC BY 4.0
