Phylonium: fast estimation of evolutionary distances from large samples of similar genomes.

Fabian Klötzl, Bernhard Haubold, Peter Robinson

doi:10.1093/bioinformatics/btz903

Open Access

Abstract

Motivation: Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence.

Results: We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium.

Availability and implementation: Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium.

Supplementary information: Supplementary data are available at Bioinformatics online.

Summary

Introduction

Methods for rapid sequence comparison are a staple of bioinformatics, if not its raison d’être. Genome aligners like mugsy have made it possible to compare samples of whole genomes (Angiuoli and Salzberg, 2011). For example, the tree of 29 Escherichia coli/Shigella genomes with an average length of 4.9 Mb in Figure 2A is based on a mugsy alignment that took 2 h 18 min to compute. This long run time illustrates that genome aligners like mugsy do not scale well with sample size. Alternatively, distance matrices can be computed from genomes without first explicitly aligning all residues, leading to much faster methods of phylogeny reconstruction.
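To make the idea of an alignment-free distance matrix concrete, here is a minimal Python sketch. It is not phylonium's algorithm, which works by indexing a single sequence. Instead it illustrates the general approach with a k-mer based estimate in the style of mash: the Jaccard index j of two k-mer sets is converted to a distance with D = -(1/k) ln(2j / (1 + j)). The function names, the toy sequences and the choice k = 5 are assumptions made for illustration only, and real tools sketch the k-mer sets rather than comparing them in full.

```python
import math
from itertools import combinations


def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_style_distance(j, k):
    """Convert a Jaccard index into a mash-style distance estimate.

    Equivalent to -(1/k) * ln(2j / (1 + j)); sequences without any
    shared k-mers are assigned the maximum distance of 1.0.
    """
    if j == 0.0:
        return 1.0
    return math.log((1.0 + j) / (2.0 * j)) / k


def distance_matrix(seqs, k=5):
    """Compute all pairwise distances without aligning the sequences."""
    names = list(seqs)
    sets = {name: kmers(seqs[name], k) for name in names}
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = mash_style_distance(jaccard(sets[a], sets[b]), k)
        dist[a][b] = dist[b][a] = d
    return names, dist


# Toy input (hypothetical): "b" equals "a", "c" differs from "a" by one base,
# and "d" shares no 5-mer with the others.
toy = {
    "a": "ACGTACGTGACCTGA",
    "b": "ACGTACGTGACCTGA",
    "c": "ACGTACGAGACCTGA",
    "d": "TTTTTTTTTTTTTTT",
}
names, dist = distance_matrix(toy, k=5)
for a in names:
    print(a, " ".join(f"{dist[a][b]:.4f}" for b in names))
```

Each pair of genomes is compared independently here, so the work grows quadratically with sample size, but no residue-by-residue alignment is ever built. A matrix like this can then be passed to a distance-based method such as neighbor joining to obtain a tree.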

Journal: Computer applications in the biosciences : CABIOS
Publication Date: Dec 2, 2019
Citations: 18
License type: CC BY 4.0