Construction of a de Bruijn Graph for Assembly from a Truncated Suffix Tree

Bastien Cazaux, Eric Rivals, Thierry Lecroq

doi:10.1007/978-3-319-15579-1_8

Abstract

In the life sciences, determining the sequence of bio-molecules is an essential step towards understanding their functions and interactions inside an organism. Powerful technologies make it possible to obtain huge quantities of short sequencing reads, which need to be assembled to infer the complete target sequence. These constraints favour the use of a version of the de Bruijn Graph (DBG) dedicated to assembly. The de Bruijn Graph is usually built directly from the reads, which is time and space consuming. Given a set \(R\) of input words, well-known data structures, like the generalised suffix tree, can index all the substrings of words in \(R\). In the context of DBG assembly, only substrings of length \(k+1\) and some of length \(k\) are useful. A truncated version of the suffix tree can index those efficiently. As indexes are exploited for numerous purposes in bioinformatics, such as read cleaning, filtering, or even analysis, it is important to enable the community to reuse an existing index to build the DBG directly from it. In an earlier work we provided the first algorithms that start from a suffix tree or a suffix array. Here, we present an algorithm that exploits a reduced version of the truncated suffix tree and computes the DBG from it. Importantly, a variation of this algorithm is also shown to compute the contracted DBG, which offers great benefits in practice. Both algorithms are linear in time and space in the size of the output.
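To make the role of the substrings of length \(k\) and \(k+1\) concrete, here is a minimal Python sketch of the direct construction from the reads, i.e. the baseline the abstract describes as time and space consuming. It is not the truncated suffix tree algorithm of the paper. It assumes the common assembly-oriented definition, which the abstract does not spell out: nodes are the \(k\)-mers occurring in the reads, and there is an arc from \(u\) to \(v\) when some \((k+1)\)-mer of the reads has \(u\) as prefix and \(v\) as suffix. The function name `de_bruijn_graph` and the sample reads are hypothetical.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Naive DBG construction (illustrative sketch, not the paper's algorithm).

    Assumed definition: nodes are the k-mers of the reads; an arc u -> v
    exists when a (k+1)-mer of the reads starts with u and ends with v.
    """
    nodes = set()
    arcs = defaultdict(set)
    for read in reads:
        # Every substring of length k becomes a node.
        for i in range(len(read) - k + 1):
            nodes.add(read[i:i + k])
        # Every substring of length k+1 yields an arc between its
        # length-k prefix and its length-k suffix.
        for i in range(len(read) - k):
            kplus1 = read[i:i + k + 1]
            arcs[kplus1[:-1]].add(kplus1[1:])
    # Sort successors so the result is deterministic.
    return nodes, {u: sorted(vs) for u, vs in arcs.items()}


# Hypothetical toy input: three overlapping reads, k = 2.
reads = ["ACGTA", "CGTAC", "GTACG"]
nodes, arcs = de_bruijn_graph(reads, 2)
for u in sorted(arcs):
    print(u, "->", ", ".join(arcs[u]))
```

This sketch re-extracts every \(k\)-mer and \((k+1)\)-mer of every read and keeps them in hash-based containers. An index that already stores those substrings would make that pass unnecessary, which is the motivation the abstract gives for building the DBG from an existing index.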
