Cognac: rapid generation of concatenated gene alignments for phylogenetic inference from large, bacterial whole genome sequencing datasets

Ryan D Crawford, Evan S Snitkin

doi:10.1186/s12859-021-03981-4


Journal: BMC Bioinformatics
Publication Date: Feb 15, 2021
Citations: 10
License type: open-access

Affiliation: University of Michigan–Ann Arbor

Abstract

Background: The quantity of genomic data is expanding at an increasing rate. Tools for phylogenetic analysis that scale to the quantity of available data are required. To address this need, we present cognac, a user-friendly software package to rapidly generate concatenated gene alignments for phylogenetic analysis.

Results: We illustrate that cognac is able to rapidly identify phylogenetic marker genes using a data-driven approach and efficiently generate concatenated gene alignments for very large genomic datasets. To benchmark our tool, we generated core gene alignments for eight unique genera of bacteria, including a dataset of over 11,000 genomes from the genus Escherichia producing an alignment with 1353 genes, which was constructed in less than 17 h.

Conclusions: We demonstrate that cognac presents an efficient method for generating concatenated gene alignments for phylogenetic analysis. We have released cognac as an R package (https://github.com/rdcrawford/cognac) with customizable parameters for adaptation to diverse applications.

Highlights

The quantity of genomic data is expanding at an increasing rate.
Multiple sequence alignment (MSA) is a foundational tool in many disciplines of biology. It aims to capture the relationships between residues of related biological sequences and to facilitate insights into the evolutionary or structural relationships between the sequences in the alignment.
We present cognac, a novel data-driven method and rapid algorithm for identifying phylogenetic marker genes from whole genome sequences and generating concatenated gene alignments, which scales to extremely large datasets of more than 11,000 bacterial genomes.

Summary

Introduction

Tools for phylogenetic analysis that scale to the quantity of available data are required. To address this need, we present cognac, a user-friendly software package to rapidly generate concatenated gene alignments for phylogenetic analysis. It was quickly observed that individual gene trees are often inaccurate estimates of the species tree [3]. These incongruences can arise from errors while building the tree, or from biological processes such as incomplete lineage sorting, hidden paralogy, and horizontal gene transfer [4].
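
To make the idea of a concatenated gene alignment concrete, the following is a minimal Python sketch. It is not cognac's implementation (cognac is an R package, and its internals are not described here). The function names, the dictionary layout, and the toy sequences are all hypothetical and chosen only for illustration. The sketch assumes each gene has already been aligned separately, keeps only genes present in every genome, and joins the per-gene rows end to end for each genome.

```python
from typing import Dict, List

# Hypothetical layout: gene name -> {genome name -> aligned sequence row}
GeneAlignments = Dict[str, Dict[str, str]]


def core_genes(gene_alignments: GeneAlignments, genomes: List[str]) -> List[str]:
    """Return, in sorted order, the genes that have an aligned row for every genome."""
    return sorted(
        gene
        for gene, rows in gene_alignments.items()
        if all(genome in rows for genome in genomes)
    )


def concatenate_alignments(
    gene_alignments: GeneAlignments, genomes: List[str]
) -> Dict[str, str]:
    """Join the per-gene aligned rows of the core genes end to end for each genome."""
    genes = core_genes(gene_alignments, genomes)
    for gene in genes:
        # Every row of a single gene alignment must have the same length.
        lengths = {len(gene_alignments[gene][genome]) for genome in genomes}
        if len(lengths) != 1:
            raise ValueError(f"rows of {gene} have unequal lengths: {sorted(lengths)}")
    return {
        genome: "".join(gene_alignments[gene][genome] for gene in genes)
        for genome in genomes
    }


# Toy, made-up data (not taken from the paper).
toy = {
    "geneA": {"g1": "ATG-CA", "g2": "ATGGCA", "g3": "ATG-CT"},
    "geneB": {"g1": "TTGA", "g2": "TTGA", "g3": "CTGA"},
    "geneC": {"g1": "GGC", "g2": "GGC"},  # absent from g3, so not a core gene
}
genomes = ["g1", "g2", "g3"]
concatenated = concatenate_alignments(toy, genomes)

for genome in genomes:
    print(genome, concatenated[genome])
```

In this toy example geneC is dropped because it is missing from one genome, and every genome ends up with a row of equal length that could be passed to a tree-building program.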



