Abstract

Background: With the increasing availability of whole genome sequences, it is becoming more and more important to use complete genome sequences for inferring species phylogenies. We developed a new tool, ComPhy ('Composite Distance Phylogeny'), based on a composite distance matrix calculated from the comparison of complete gene sets between genome pairs to produce a prokaryotic phylogeny.

Results: The composite distance between two genomes is defined by three components: Gene Dispersion Distance (GDD), Genome Breakpoint Distance (GBD) and Gene Content Distance (GCD). GDD quantifies the dispersion of orthologous genes along the genomic coordinates from one genome to another; GBD measures the shared breakpoints between two genomes; GCD measures the level of shared orthologs between two genomes. The phylogenetic tree is constructed from the composite distance matrix using a neighbor joining method. We tested our method on 9 datasets from 398 completely sequenced prokaryotic genomes. We have achieved above 90% agreement in quartet topologies between the tree created by our method and the tree from the Bergey's taxonomy. In comparison to several other phylogenetic analysis methods, our method showed consistently better performance.

Conclusion: ComPhy is a fast and robust tool for genome-wide inference of evolutionary relationship among genomes. It can be downloaded from .
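To make the three components more concrete, the following is a minimal sketch of how such distances could be computed for a pair of genomes. It is not ComPhy's implementation. The abstract does not give the formulas, so every normalisation here is an assumption chosen for illustration: GCD as one minus the shared fraction of the smaller gene set, GBD as the fraction of gene adjacencies that are not conserved, GDD as the mean normalised displacement of shared orthologs, and the composite as a root mean square of the three. Each genome is represented as a hypothetical ordered list of ortholog identifiers, with every identifier appearing at most once per genome.

```python
from math import sqrt


def _shared_orders(order_a, order_b):
    """Restrict both gene orders to the orthologs present in both genomes."""
    shared = set(order_a) & set(order_b)
    return ([g for g in order_a if g in shared],
            [g for g in order_b if g in shared])


def _adjacencies(order):
    """Unordered pairs of neighbouring genes in a linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def gene_content_distance(order_a, order_b):
    """Assumed form: 1 - shared orthologs / size of the smaller gene set."""
    set_a, set_b = set(order_a), set(order_b)
    smaller = min(len(set_a), len(set_b))
    if smaller == 0:
        return 1.0
    return 1.0 - len(set_a & set_b) / smaller


def breakpoint_distance(order_a, order_b):
    """Assumed form: fraction of adjacencies in A that are broken in B."""
    a, b = _shared_orders(order_a, order_b)
    adj_a, adj_b = _adjacencies(a), _adjacencies(b)
    if not adj_a:
        return 1.0  # nothing to compare, treat as maximally distant
    return len(adj_a - adj_b) / len(adj_a)


def gene_dispersion_distance(order_a, order_b):
    """Assumed form: mean rank displacement of shared orthologs, scaled to [0, 1]."""
    a, b = _shared_orders(order_a, order_b)
    n = len(a)
    if n < 2:
        return 1.0
    pos_b = {gene: i for i, gene in enumerate(b)}
    total = sum(abs(i - pos_b[gene]) for i, gene in enumerate(a))
    return total / (n * (n - 1))


def composite_distance(order_a, order_b):
    """Assumed combination: root mean square of the three components."""
    parts = (gene_dispersion_distance(order_a, order_b),
             breakpoint_distance(order_a, order_b),
             gene_content_distance(order_a, order_b))
    return sqrt(sum(p * p for p in parts) / len(parts))


# Two toy genomes: one local inversion (g2/g3) and one unshared gene each.
genome_a = ["g1", "g2", "g3", "g4", "g5"]
genome_b = ["g1", "g3", "g2", "g4", "g6"]
print(composite_distance(genome_a, genome_b))
```

Applying `composite_distance` to every genome pair would fill a symmetric distance matrix, which is the kind of input a neighbor joining method takes to build the tree.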

Highlights

  • Variation of ortholog definition: To test the robustness of our ortholog definition, different E-value cut-offs and sequence identity thresholds were selected for performance evaluation

  • Figure 5: Phylogenetic trees based on different gene selections. (a) Phylogenetic tree of 13 bacterial species based on CTP synthase affiliates with Bacteroidetes; (b) phylogenetic tree of 13 bacterial species based on glyA affiliates with Chlorobi; (c) phylogenetic tree of 13 bacterial species based on Chaperonin Hsp60 affiliates with the superphylum Bacteroidetes-Chlorobi; (d) phylogenetic tree of 13 bacterial species based on the whole-genome gene sets


Summary

Introduction

Attempts to explicate the phylogeny of prokaryotes based on the small subunit ribosomal RNA (ssu-rRNA) have been by-and-large successful [3,4]. Although such molecules have proved to be very useful phylogenetic markers, mutational saturation is a problem because of their restricted length and limited number of mutation sites [5]. With the increasing availability of whole genome sequences, methods that use the vast amount of phylogenetic information contained in complete genome sequences are becoming more and more important for inferring species phylogenies. Phylogenomics, i.e. using entire genomes to infer a species tree, represents the state of the art for reconstructing phylogenies [11,12].

