RepairNatrix: a Snakemake workflow for processing DNA sequencing data for DNA storage.

Peter Michael Schwarz, Marius Welzel, Dominik Heider, Bernd Freisleben

doi:10.1093/bioadv/vbad117

Open Access

https://doi.org/10.1093/bioadv/vbad117

Journal: Bioinformatics Advances
Publication Date: Jan 5, 2023
Citations: 1
License type: CC BY 4.0

Affiliation: Philipps University of Marburg

Abstract

Error-correcting and constrained codes for DNA storage systems have progressed rapidly in recent years. However, the steps for processing raw sequencing data for DNA storage still have considerable untapped potential. In particular, constraints can be used as prior information to improve the processing of DNA sequencing data. Furthermore, a workflow tailored to DNA storage codes enables fair comparisons between different approaches and leads to reproducible results. We present RepairNatrix, a read-processing workflow for DNA storage. RepairNatrix supports preprocessing of raw sequencing data for DNA storage applications and can flag and heuristically repair constraint-violating sequences to further increase the recoverability of encoded data in the presence of errors. Compared to a preprocessing strategy without repair functionality, RepairNatrix reduced the number of raw reads required for the successful, error-free decoding of the input files by a factor of 25-35 across different datasets. RepairNatrix is available on GitHub: https://github.com/umr-ds/repairnatrix.
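
To make the idea of flagging constraint-violating sequences concrete, here is a minimal Python sketch. It is not RepairNatrix code. The two constraints (a GC-content window and a maximum homopolymer run length), the thresholds, and all function names are assumptions chosen for illustration, because the abstract does not say which constraints the workflow checks. The sketch only flags reads and does not attempt any repair.

```python
from itertools import groupby


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of a single repeated base."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def violated_constraints(seq: str, gc_min: float = 0.4, gc_max: float = 0.6,
                         max_run: int = 3) -> list:
    """Return the names of the (hypothetical) constraints that seq violates."""
    violations = []
    if not gc_min <= gc_content(seq) <= gc_max:
        violations.append("gc_content")
    if longest_homopolymer(seq) > max_run:
        violations.append("homopolymer")
    return violations


# Example reads (made up for illustration, not taken from any dataset).
reads = ["ACGTACGTAC", "AAAAACGTGC", "ATATATATAT"]

# Map each flagged read to the constraints it violates.
flagged = {}
for read in reads:
    violations = violated_constraints(read)
    if violations:
        flagged[read] = violations
```

In a workflow of this kind, reads that pass every check could go straight to decoding, while flagged reads would be candidates for a repair heuristic.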

Similar Papers

NanoForms: an integrated server for processing, analysis and assembly of raw sequencing data of microbial genomes, from Oxford Nanopore technology.
Anna Czmil ... Tomasz Wołkowicz
PeerJ | VOL. 10
29 Mar 2022

LRTK: a platform agnostic toolkit for linked-read analysis of both human genome and metagenome.
Chao Yang ... Lu Zhang
GigaScience | VOL. 13
02 Jan 2024

The Microbial Antarctic Resource System: Integrating discoverability and preservation of environmentally-annotated microbial 'omics data
Maxime Sweetlove ... Yi Ming Gan
Biodiversity Information Science and Standards | VOL. 3
10 Jul 2019

Automated processing of raw DNA sequence data.
M.C Wendl ... A.T Chinwalla
IEEE Engineering in Medicine and Biology Magazine | VOL. 20
01 Jan 2001
