Abstract

Background: Inferring phylogenetic trees for newly recovered genomes from metagenomic samples is very useful in determining the identities of uncultivated microorganisms. Even though 16S ribosomal RNA small subunit genes have been established as “gold standard” markers for inferring phylogenetic trees, they usually cannot be assembled very well in metagenomes due to shared regions among 16S genes. Using single-copy marker genes to build genome trees has become increasingly popular for uncultivated species. Predefined marker gene sets were discovered and have been applied in various genomic studies; however these gene sets might not be adequate for novel, uncultivated, draft, or incomplete genomes. The automatic identification of marker gene sets among a set of genomes with different assembly qualities has thus become a very important task for inferring reliable phylogenetic relationships for microbial populations.

Results: A computational pipeline, ezTree, was developed to automatically identify single-copy marker genes for a group of genomes and build phylogenetic trees from the marker genes. Testing ezTree on a group of proteobacteria species revealed that ezTree was highly effective in pinpointing marker genes and constructing reliable trees for different groups of bacterial genomes. Applying ezTree to genomes that were recently recovered from metagenomes also showed that ezTree can help elucidate taxonomic relationships among newly recovered genomes and existing ones.

Conclusions: The development of ezTree can help scientists build reliable phylogenetic trees for uncultivated species retrieved from environmental samples. The uncovered single-copy marker genes may also provide crucial hints for understanding shared features of a group of microbes. The ezTree pipeline is freely available at https://github.com/yuwwu/ezTree under a GNU GPLv3 license.

Highlights

  • Inferring phylogenetic trees for newly recovered genomes from metagenomic samples is very useful in determining the identities of uncultivated microorganisms

  • The Clusters of Orthologous Groups (COG) categories of the marker genes were identified by mapping Pfam protein family profiles against COGs

  • The comparison of trees built for the set of Proteobacteria genomes provided by FastTree

Introduction

One of the most popular techniques for investigating microbial communities is metagenomics, which seeks to obtain genomic sequences directly from the environment. To understand the microbial diversity of the recovered genomes and place them in the tree of life, phylogenetic marker genes have been used to build trees for the newly identified species. 16S ribosomal RNA small-subunit genes, one of the most widely adopted phylogenetic markers, have been established as the “gold standard” for probing the taxonomy of newly recovered organisms and constructing phylogenetic trees [23, 24]. However, genomes recovered from metagenomes usually lack 16S genes (or contain only very short gene fragments), making it very difficult or impossible to build phylogenetic trees from 16S sequences.
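The idea of a single-copy marker gene set, as described in the abstract, can be illustrated with a small sketch. This is not the ezTree implementation. It assumes each genome has already been annotated with protein family hits, and it simply keeps the families that occur exactly once in every genome. The function name `single_copy_markers`, the input layout, and the family labels (`fam1`, `fam2`, `fam3`) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def single_copy_markers(genome_hits):
    """Return the families that occur exactly once in every genome.

    genome_hits: dict mapping a genome name to a list of family IDs,
    one entry per annotated gene (hypothetical input layout).
    """
    candidates = None
    for hits in genome_hits.values():
        counts = Counter(hits)
        # Families seen exactly once in this genome.
        once = {family for family, n in counts.items() if n == 1}
        # Keep only families that are single-copy in all genomes so far.
        candidates = once if candidates is None else candidates & once
    return candidates if candidates is not None else set()


# Toy input with made-up family labels, not real annotation data.
example = {
    "genome_A": ["fam1", "fam2", "fam3"],
    "genome_B": ["fam1", "fam2", "fam2"],  # fam2 duplicated, fam3 missing
    "genome_C": ["fam1", "fam3", "fam2"],
}
markers = sorted(single_copy_markers(example))
print(markers)
```

In this toy input only `fam1` is present exactly once in all three genomes. A strict all-genomes rule like this one is a simplification: for draft or incomplete genomes, a practical pipeline may need to tolerate missing genes.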
