The fragment assembly string graph

Eugene W Myers

doi:10.1093/bioinformatics/bti1114

Abstract

We present a concept and formalism, the string graph, which represents all that is inferable about a DNA sequence from a set of shotgun sequencing reads collected from it. We give time- and space-efficient algorithms for constructing a string graph given the collection of overlaps between the reads and, in particular, present a novel linear expected-time algorithm for transitive reduction in this context. The result demonstrates that the decomposition of reads into k-mers employed in the de Bruijn graph approach described earlier is not essential, and exposes its close connection to the unitig approach we developed at Celera. This paper is a preliminary piece giving the basic algorithm and results that demonstrate the efficiency and scalability of the method. These ideas are being used to build a next-generation whole-genome assembler called BOA (Berkeley Open Assembler) that will easily scale to mammalian genomes.
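
To make the idea of transitive reduction on an overlap graph concrete, here is a minimal sketch. It is a naive version that drops an edge whenever a two-step path through another read connects the same endpoints; it is not the linear expected-time algorithm the abstract refers to. The read names, the `Graph` alias, and the `transitive_reduction` function are hypothetical, and the sketch assumes an acyclic graph and ignores overlap lengths and read orientation.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # read id -> ids of reads it overlaps into


def transitive_reduction(graph: Graph) -> Graph:
    """Remove each edge u->w that is implied by a two-step path u->v->w.

    Assumes the graph is acyclic. This is a naive illustration, not the
    linear expected-time algorithm described in the paper.
    """
    reduced = {u: set(succs) for u, succs in graph.items()}
    for u, succs in graph.items():
        for v in succs:
            for w in graph.get(v, ()):
                # u->w is redundant: the path u->v->w already covers it.
                if w in succs and w != v and w != u:
                    reduced[u].discard(w)
    return reduced


# Hypothetical reads A, B, C laid out left to right, each overlapping the next.
overlaps = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
reduced = transitive_reduction(overlaps)
print(reduced)
```

In this toy input the edge from A to C is removed because A overlaps B and B overlaps C, leaving the chain A, B, C. Redundancy is decided against the original graph rather than the partially reduced one, so the result does not depend on iteration order.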

Journal: Bioinformatics
Publication Date: Sep 1, 2005
Citations: 410

Similar Papers

Integration of string and de Bruijn graphs for genome assembly.
Yao-Ting Huang ... Chen-Fu Liao
Bioinformatics | VOL. 32
10 Jan 2016

Manifold de Bruijn Graphs
Yu Lin ... Pavel A Pevzner
01 Jan 2014

SparseAssembler2: Sparse k-mer Graph for Memory Efficient Genome Assembly
...
F1000Research | VOL. 2
18 Oct 2011

An Approximate de Bruijn Graph Approach to Multiple Local Alignment and Motif Discovery in Protein Sequences
Rupali Patwardhan ... Haixu Tang
01 Jan 2006
