Abstract

Background: As a key parameter of genome sequence variation, the GC content of bacterial genomes has been investigated for over half a century, and many hypotheses have been put forward to explain this GC content variation and its relationship to other fundamental processes. Previously, we classified eubacteria into dnaE-based groups (the dimeric combination of DNA polymerase III alpha subunits), under the hypothesis that GC content variation is essentially governed by genome replication and DNA repair mechanisms. Further investigation led to the discovery that two major mutator genes, polC and dnaE2, may be responsible for genomic GC content variation. Consequently, an in-depth analysis was conducted to evaluate various potential intrinsic and extrinsic factors in association with GC content variation among eubacterial genomes.

Results: Mutator genes, especially those with dominant effects on the mutation spectra, are biased towards either GC or AT richness, and they alter genomic GC content in opposite directions. Increased bacterial genome size (or gene number) appears to rely on increased genomic GC content; however, it is unclear whether the changes are directly related to certain environmental pressures. Certain environmental and bacteriological features are related to GC content variation, but their trends are more obvious when analyzed under the dnaE-based grouping scheme. Most terrestrial, plant-associated, and nitrogen-fixing bacteria are members of the dnaE1|dnaE2 group, whereas most pathogenic or symbiotic bacteria in insects, and those dwelling in aquatic environments, are largely members of the dnaE1|polV group.

Conclusion: Our studies provide several lines of evidence indicating that the DNA polymerase III α subunit and its isoforms, which participate in either replication (such as polC) or SOS mutagenesis/translesion synthesis (such as dnaE2), play dominant roles in determining GC variability. Other environmental or bacteriological factors, such as genome size, temperature, oxygen requirement, and habitat, either play subsidiary roles or rely indirectly on different mutator genes to fine-tune the GC content. These results provide a comprehensive insight into mechanisms of GC content variation and the robustness of eubacterial genomes in adapting to their ever-changing environments over billions of years.

Reviewers: This paper was reviewed by Nicolas Galtier, Adam Eyre-Walker, and Eugene Koonin.

Highlights

  • As a key parameter of genome sequence variation, the GC content of bacterial genomes has been investigated for over half a century, and many hypotheses have been put forward to explain this GC content variation and its relationship to other fundamental processes

  • We discovered an excellent correlation between GC content variation and the dimeric combinations of DNA polymerase III alpha subunits, which showed that eubacteria can be grouped into different GC-variable groups: the full-spectrum or dnaE1 group, the high-GC or dnaE2-dnaE1 group, and the low-GC or polC-dnaE3 group [28]

  • The study of GC content variation focused on a dataset containing 364 non-redundant eubacterial genomes, rather than all of the bacterial genomes available in the public databases

Summary

Introduction

As a key parameter of genome sequence variation, the GC content of bacterial genomes has been investigated for over half a century, and many hypotheses have been put forward to explain this GC content variation and its relationship to other fundamental processes. Further investigation led to the discovery that two major mutator genes, polC and dnaE2, may be responsible for genomic GC content variation. Several essential questions concerning GC content and its variability remain to be addressed. How does it vary: randomly, gene-centrically, in a species-regulated manner, or under selection? Mutations should generally conform to one of two patterns, global or transcript-centric, each derived from a different mechanism. The former is attributable to DNA replication and global repair, and the latter is mainly the result of transcription-coupled repair [10,11,12]. Given the fundamental role of the environment or habitat in species evolution [13,14,15], another way to study GC content variation is to differentiate intrinsic from extrinsic (mostly environmental) factors, and to measure their impacts on GC content variability and evolvability, both qualitatively and quantitatively. Different hypotheses have been proposed by numerous authors to explain why GC content varies and how it is related to different intrinsic and extrinsic factors [16,17,18,19,20,21,22,23,24,25,26,27,28].
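
For readers unfamiliar with the quantity under discussion, GC content is conventionally the fraction of guanine and cytosine among the bases of a sequence. The following is a minimal sketch of that calculation, not the authors' pipeline: the function name `gc_content`, the choice to ignore ambiguous bases such as N, and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored; this is an illustrative
    convention, not one stated in the source text.
    """
    counts = Counter(sequence.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return gc / total


# Hypothetical toy sequences, not taken from the study's dataset.
toy_genomes = {
    "toy_low_gc": "ATATTAGCATAATTTA",
    "toy_high_gc": "GCGCCGGATCGCGGCC",
}

for name, seq in toy_genomes.items():
    print(f"{name}: GC = {gc_content(seq):.2%}")
```

In practice the same ratio would be computed over whole-genome sequences, and the per-genome values could then be compared across the dnaE-based groups described above.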

