Fast exact gap-affine partial order alignment with POASTA

Lucas R Van Dijk, Abigail L Manson, Ashlee M Earl, Kiran V Garimella, Thomas Abeel

doi:10.1093/bioinformatics/btae757

Open Access

https://doi.org/10.1093/bioinformatics/btae757

Journal: Bioinformatics
Publication Date: Jan 3, 2025
License type: CC BY 4.0

Abstract

Motivation: Partial order alignment is a widely used method for computing multiple sequence alignments, with applications in genome assembly and pangenomics, among many others. Current algorithms to compute the optimal, gap-affine partial order alignment do not scale well to larger graphs and sequences. While heuristic approaches exist, they do not guarantee optimal alignment and sacrifice alignment accuracy.

Results: We present POASTA, a new optimal algorithm for partial order alignment that exploits long stretches of matching sequence between the graph and a query. We benchmarked POASTA against the state-of-the-art on several diverse bacterial gene datasets and demonstrated an average speed-up of 4.1x, and up to 9.8x, while using less memory. POASTA’s memory scaling characteristics enabled the construction of much larger POA graphs than previously possible, as demonstrated by megabase-length alignments of 342 Mycobacterium tuberculosis sequences.

Availability and implementation: POASTA is available on GitHub at https://github.com/broadinstitute/poasta.

Full Text