Parsnp 2.0: scalable core-genome alignment for massive microbial datasets.

Bryce Kille, Michael G Nute, Victor Huang, Eddie Kim, Adam M Phillippy, Todd J Treangen

doi:10.1093/bioinformatics/btae311


Open Access


Journal: Bioinformatics (Oxford, England)
Publication Date: May 2, 2024
Citations: 1
License type: CC BY 4.0

Affiliation: Rice University

Abstract

Since 2016, the number of microbial species with available reference genomes in NCBI has more than tripled. Multiple genome alignment, the process of identifying nucleotides across multiple genomes that share a common ancestor, is used as the input to numerous downstream comparative analysis methods. Parsnp is one of the few multiple genome alignment methods able to scale to the current era of genomic data; however, it has had no major update since its initial release in 2014. To address this gap, we developed Parsnp v2, which significantly improves on the original release. Parsnp v2 gives users more control over how the program runs, allowing Parsnp to be better tailored to different use cases. We introduce a partitioning option, which allows the input to be split across multiple parallel alignment processes whose results are then combined into a final alignment. The partitioning option can reduce memory usage by over 4× and runtime by over 2×, all while maintaining a precise core-genome alignment. The partitioning workflow is also less susceptible to complications caused by assembly artifacts and minor variation, as alignment anchors only need to be conserved within their partition and not across the entire input set. We highlight the performance on datasets involving thousands of bacterial and viral genomes. Parsnp v2 is available at https://github.com/marbl/parsnp.
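
The splitting step of the partitioning idea described in the abstract can be illustrated with a minimal Python sketch. This is not Parsnp's implementation: the function name `partition_genomes`, the `partition_size` parameter, and the file names are hypothetical, and the per-partition alignment and the final merge are deliberately left out. It only shows how an input set might be divided into groups that could each be aligned independently.

```python
from typing import List, Sequence


def partition_genomes(genomes: Sequence[str], partition_size: int) -> List[List[str]]:
    """Split genome paths into consecutive groups of at most partition_size.

    Hypothetical helper for illustration only; Parsnp's own partitioning
    logic may choose groups differently.
    """
    if partition_size < 1:
        raise ValueError("partition_size must be at least 1")
    return [
        list(genomes[i:i + partition_size])
        for i in range(0, len(genomes), partition_size)
    ]


# Ten made-up genome file names, split into groups of at most four.
genomes = [f"genome_{i:04d}.fna" for i in range(10)]
partitions = partition_genomes(genomes, partition_size=4)

for index, group in enumerate(partitions):
    print(f"partition {index}: {len(group)} genomes")
```

With smaller groups, each alignment process holds fewer genomes in memory at once, which is the intuition behind the reported memory savings; the actual savings depend on how Parsnp performs and merges the per-partition alignments.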


