Abstract

Phylogenetic trees represent the evolutionary relationships among groups of species. This paper proposes a novel method, CGCPhy, for inferring prokaryotic phylogenies from multiple sources of genomic information. CGCPhy is based on a distance matrix of orthologous gene clusters computed between whole-genome pairs, and it comprises four main steps. First, orthologous genes are determined from sequence similarity, genomic function, and genomic structure information. Second, genes potentially involved in horizontal gene transfer (HGT) events are eliminated; these are taken to be the genes that are highly conserved across different species and the genes located on fragments with an abnormal genome barcode. Third, the distance between the orthologous gene clusters of each genome pair is calculated in terms of the number of orthologous genes in conserved clusters. Finally, the neighbor-joining method is employed to construct phylogenetic trees across the species. CGCPhy has been examined on different datasets drawn from 617 complete single-chromosome prokaryotic genomes, and the quartet topologies it produces on the different species sets agree with Bergey's taxonomy at a practically useful level of accuracy. Simulation results show that CGCPhy achieves high average accuracy with a low standard deviation across datasets, so it has practical potential for phylogenetic analysis.
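
The third and fourth steps can be illustrated with a minimal Python sketch. It assumes that steps one and two have already reduced each genome to a set of orthologous-gene identifiers lying in conserved clusters. The distance formula (one minus the shared count divided by the smaller set size), the function names, and the toy genomes are all assumptions made for illustration; the abstract does not give CGCPhy's actual formula. The tree builder is a textbook neighbor-joining loop that returns only the topology, without branch lengths.

```python
from itertools import combinations


def cluster_distance(genes_a, genes_b):
    """Hypothetical distance: 1 - shared / size of the smaller gene set.

    This normalisation is an assumption, not the formula used by CGCPhy.
    """
    smaller = min(len(genes_a), len(genes_b))
    if smaller == 0:
        return 1.0
    return 1.0 - len(genes_a & genes_b) / smaller


def distance_matrix(genomes):
    """Build a symmetric nested-dict distance matrix from {name: gene set}."""
    names = sorted(genomes)
    dist = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        dist[a][b] = dist[b][a] = cluster_distance(genomes[a], genomes[b])
    return dist


def neighbor_joining(dist):
    """Return an unrooted Newick topology (no branch lengths)."""
    dist = {a: dict(row) for a, row in dist.items()}  # work on a copy
    nodes = list(dist)
    while len(nodes) > 3:
        n = len(nodes)
        total = {a: sum(dist[a][b] for b in nodes) for a in nodes}

        # Standard NJ criterion: join the pair that minimises Q.
        def q(pair):
            i, j = pair
            return (n - 2) * dist[i][j] - total[i] - total[j]

        i, j = min(combinations(nodes, 2), key=q)
        new = f"({i},{j})"
        dist[new] = {new: 0.0}
        for k in nodes:
            if k in (i, j):
                continue
            # Distance from the new internal node to every remaining node.
            d = 0.5 * (dist[i][k] + dist[j][k] - dist[i][j])
            dist[new][k] = dist[k][new] = d
        nodes = [k for k in nodes if k not in (i, j)] + [new]
    return "(" + ",".join(nodes) + ");"


# Toy input: A and B share five genes, C and D share five genes,
# and every cross pair shares only g1.
genomes = {
    "A": {"g1", "g2", "g3", "g4", "g5", "g6"},
    "B": {"g1", "g2", "g3", "g4", "g5", "g7"},
    "C": {"g1", "g8", "g9", "g10", "g11", "g12"},
    "D": {"g1", "g8", "g9", "g10", "g11", "g13"},
}
tree = neighbor_joining(distance_matrix(genomes))
print(tree)
```

On this toy input the single join groups A with B or C with D, which is the same quartet topology either way. A real analysis would more likely use an established implementation that also estimates branch lengths.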

Highlights

  • There are about 10 to 500 thousand species of prokaryotes living on the Earth today [1]

  • We propose a novel method called conserved gene cluster phylogenies (CGCPhy), which removes the genes potentially involved in horizontal gene transfer (HGT) events, for inferring prokaryotic phylogenies

  • We infer prokaryotic phylogenies using CGCPhy, which is based on the distance matrix of orthologous gene clusters between whole-genome pairs

Introduction

There are about 10 to 500 thousand species of prokaryotes living on the Earth today [1]. Over their long existence, prokaryotes have developed more complicated evolutionary relationships than eukaryotes. Having evolved in different environments, prokaryotes show considerable diversity in both genetic and physical processes, which lets them adapt to different conditions. Phylogenies represent the evolutionary relationships among groups of species. Studying the phylogenies of different prokaryotes can help us understand the similarities and differences in genotype and phenotype among them. Woese and Fox first proposed a molecular phylogeny of prokaryotes based on the universally distributed small subunit ribosomal RNA (SSU rRNA) [2]. rRNAs were commonly recommended as the molecular standard for reconstructing phylogenies [3, 4]. Phylogeny of prokaryotes inferred by rRNAs or genes has been immensely successful
