Detecting high-scoring local alignments in pangenome graphs.

Tizian Schulz, Jens Stoye, Sven Rahmann, Faraz Hach, Roland Wittler

doi:10.1093/bioinformatics/btab077

Open Access

Abstract

Motivation: Increasing amounts of individual genomes sequenced per species motivate the usage of pangenomic approaches. Pangenomes may be represented as graphical structures, e.g. compacted colored de Bruijn graphs, which offer a low memory usage and facilitate reference-free sequence comparisons. While sequence-to-graph mapping to graphical pangenomes has been studied for some time, no local alignment search tool in the vein of BLAST has been proposed yet.

Results: We present a new heuristic method to find maximum scoring local alignments of a DNA query sequence to a pangenome represented as a compacted colored de Bruijn graph. Our approach additionally allows a comparison of similarity among sequences within the pangenome. We show that local alignment scores follow an exponential-tail distribution similar to BLAST scores, and we discuss how to estimate its parameters to separate local alignments representing sequence homology from spurious findings. An implementation of our method is presented, and its performance and usability are shown. Our approach scales sublinearly in running time and memory usage with respect to the number of genomes under consideration. This is an advantage over classical methods that do not make use of sequence similarity within the pangenome.

Availability and implementation: Source code and test data are available from https://gitlab.ub.uni-bielefeld.de/gi/plast.

Supplementary information: Supplementary data are available at Bioinformatics online.
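To make the role of the exponential-tail distribution concrete, the following is a minimal sketch of how such a distribution is typically used to separate homologous from spurious alignments. It uses the classical BLAST-style (Karlin-Altschul) form E = K * m * n * exp(-lam * S), which is an assumption here: the abstract only states that the scores follow a similar distribution and does not give the formula or the parameter estimates. The function names and the parameter values in the usage lines are hypothetical.

```python
import math


def expected_hits(score, query_len, db_len, lam, k):
    """Expected number of chance alignments scoring at least `score`.

    Classical Karlin-Altschul form E = K * m * n * exp(-lam * S).
    `lam` and `k` are the distribution parameters that would have to be
    estimated for the scoring scheme in use (assumed given here).
    """
    return k * query_len * db_len * math.exp(-lam * score)


def p_value(e_value):
    """Probability of seeing at least one chance hit: 1 - exp(-E)."""
    return -math.expm1(-e_value)


def min_significant_score(e_cutoff, query_len, db_len, lam, k):
    """Smallest integer score whose E-value does not exceed `e_cutoff`."""
    return math.ceil(math.log(k * query_len * db_len / e_cutoff) / lam)


# Hypothetical parameters, for illustration only.
threshold = min_significant_score(1e-3, query_len=100, db_len=10**6, lam=0.5, k=0.1)
print(threshold, expected_hits(threshold, 100, 10**6, lam=0.5, k=0.1))
```

Alignments scoring below the threshold would be treated as likely spurious under these assumed parameters; how the parameters are actually estimated for graph alignments is the subject of the paper itself.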

Highlights

We study the problem of finding high-scoring local alignments between a query sequence and a graph that are likely to represent sequence homology.
Our algorithm finds high-scoring local alignments between a given query sequence q and a pangenome represented as a compacted colored de Bruijn graph G = (V, E, λ, C) over the DNA alphabet and a color set U.
We show the advantage in runtime, memory usage and result aggregation of searching for local alignments inside a pangenome with our method, compared to a conventional search and analysis using other BLAST-like software tools.

Summary

Introduction

A pangenome is defined as a set of genomic sequences that may be stored and analysed collectively while being represented as a single entity. The pangenomic approach offers high memory-saving potential because sequence parts shared by multiple genomes need to be stored only once. It enables the simultaneous comparison of a large number of individual genomes while avoiding classical reference-based analyses, which have turned out to have shortcomings in various cases [6, 10]. A method allowing exact read mapping on general graphs has been published [36]. Other solutions have been presented by Antipov et al. [5] and Kavya et al. [22].

Journal: Bioinformatics
Publication Date: Feb 3, 2021
Citations: 7
License type: CC BY 4.0
