Abstract

Background: Combined metagenomic and metatranscriptomic datasets make it possible to study the molecular evolution of diverse microbial species recovered from their native habitats. The link between gene expression level and sequence conservation was examined using shotgun pyrosequencing of microbial community DNA and RNA from diverse marine environments, and from forest soil.

Results: Across all samples, expressed genes with transcripts in the RNA sample were significantly more conserved than non-expressed gene sets relative to best matches in reference databases. This discrepancy, observed for many diverse individual genomes and across entire communities, coincided with a shift in amino acid usage between these gene fractions. Expressed genes trended toward GC-enriched amino acids, consistent with a hypothesis of higher levels of functional constraint in this gene pool. Highly expressed genes were significantly more likely to fall within an orthologous gene set shared between closely related taxa (core genes). However, non-core genes, when expressed above the level of detection, were, on average, significantly more highly expressed than core genes based on transcript abundance normalized to gene abundance. Finally, expressed genes showed broad similarities in function across samples, being relatively enriched in genes of energy metabolism and underrepresented by genes of cell growth.

Conclusions: These patterns support the hypothesis, predicated on studies of model organisms, that gene expression level is a primary correlate of evolutionary rate across diverse microbial taxa from natural environments. Despite their complexity, meta-omic datasets can reveal broad evolutionary patterns across taxonomically, functionally, and environmentally diverse communities.

Highlights

  • Combined metagenomic and metatranscriptomic datasets make it possible to study the molecular evolution of diverse microbial species recovered from their native habitats

  • Expressed genes evolve slowly. The relationship between gene expression and sequence conservation was examined for protein-coding genes in coupled metagenome and metatranscriptome datasets generated by shotgun pyrosequencing of microbial community DNA and RNA, respectively

  • Amino acid identity relative to a top match reference sequence identified by BLASTX against the National Center for Biotechnology Information nonredundant protein database (NCBI-nr) is used to estimate sequence conservation
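The comparison described in these highlights can be illustrated with a minimal sketch. It assumes each gene already has a percent amino acid identity to its top BLASTX match, plus read counts from the DNA and RNA datasets. The record layout, the example values, and the function names are hypothetical, and the expression ratio here uses raw read counts without the dataset-size normalization a real analysis would likely need.

```python
from statistics import mean

# Hypothetical per-gene records (illustrative values, not data from the study):
# (gene_id, percent amino acid identity to top BLASTX hit, DNA reads, RNA reads)
genes = [
    ("geneA", 92.0, 10, 5),
    ("geneB", 71.0, 8, 0),
    ("geneC", 88.0, 4, 8),
    ("geneD", 65.0, 6, 0),
]


def split_by_expression(records):
    """Partition genes into expressed (>= 1 RNA read) and non-expressed sets."""
    expressed = [r for r in records if r[3] > 0]
    non_expressed = [r for r in records if r[3] == 0]
    return expressed, non_expressed


def mean_identity(records):
    """Mean percent amino acid identity, used here as a proxy for conservation."""
    return mean(r[1] for r in records)


def expression_ratio(dna_reads, rna_reads):
    """Transcript abundance normalized to gene abundance (raw counts, an assumption)."""
    return rna_reads / dna_reads


expressed, non_expressed = split_by_expression(genes)
print("expressed mean identity:", mean_identity(expressed))
print("non-expressed mean identity:", mean_identity(non_expressed))
for gene_id, _, dna, rna in expressed:
    print(gene_id, "expression ratio:", expression_ratio(dna, rna))
```

A real comparison would also need a significance test on the two identity distributions; the sketch only shows how the gene sets are separated and summarized.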


Summary

Introduction

Combined metagenomic and metatranscriptomic datasets make it possible to study the molecular evolution of diverse microbial species recovered from their native habitats. A diverse range of factors has been postulated to affect the rate of sequence evolution within individual genomes, including mutation and recombination rate [3] and genetic contributions to fitness (that is, gene essentiality) [4]. Deep-coverage sequencing of microbial community DNA and RNA (metagenomes and metatranscriptomes) provides an unprecedented opportunity to explore protein-coding genes across diverse organisms from natural populations. Such studies have yielded valuable insight into the genetic potential and functional activity of natural communities [15,16,17,18,19], but thus far have been applied only sparingly to questions of evolution. We compare microbial metagenomic and metatranscriptomic datasets from marine and terrestrial habitats to explore fundamental properties of sequence evolution in the expressed gene set.

