Optimal Omnitig Listing for Safe and Complete Contig Assembly

Massimo Cairo, Nidia Obscura Acosta, Paul Medvedev, Roméo Rizzi, Alexandru I. Tomescu

doi:10.4230/lipics.cpm.2017.29

Abstract

Genome assembly is the problem of reconstructing a genome sequence from a set of reads produced by a sequencing experiment. In practice, typical formulations of the assembly problem admit many genomic reconstructions, and actual genome assemblers usually output contigs, namely substrings that are promised to occur in the genome. To bridge theory and practice, Tomescu and Medvedev [RECOMB 2016] reformulated contig assembly as finding all substrings common to all genomic reconstructions. They also characterized the walks (omnitigs) that are common to all closed edge-covering walks of a directed graph, a typical notion of genomic reconstruction, and proposed an algorithm that lists all maximal omnitigs by launching an exhaustive visit from every edge. In this paper, we prove new results about the structure of omnitigs and solve several open questions about them. We combine these to obtain an O(nm)-time algorithm for outputting all the maximal omnitigs of a graph with n nodes and m edges. This is optimal, as we exhibit families of graphs whose total omnitig length is Omega(nm). We implement this algorithm and show that it is 9-12 times faster in practice than the algorithm of Tomescu and Medvedev [RECOMB 2016].
