Abstract

Background: The core genome consists of genes shared by the vast majority of a species and is therefore assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, also known as the accessory genome. Here we examine intragenic base composition differences in core genomes and corresponding accessory genomes in 36 species, represented by the genomes of 731 bacterial strains, to assess the impact of selective forces on base composition in microbes. We also explore, in turn, how these results compare with findings for whole genome intragenic regions.

Results: We found that GC content in coding regions is significantly higher in core genomes than in accessory genomes and whole genomes. Likewise, GC content variation within coding regions was significantly lower in core genomes than in accessory genomes and whole genomes. Relative entropy in coding regions, measured as the difference between observed trinucleotide frequencies and those expected from mononucleotide frequencies, was significantly higher in the core genomes than in accessory and whole genomes. Relative entropy was positively associated with coding region GC content within the accessory genomes, but not within the corresponding coding regions of core or whole genomes.

Conclusion: The higher intragenic GC content and relative entropy, as well as the lower GC content variation, observed in the core genomes are most likely associated with selective constraints. It is unclear whether the positive association between GC content and relative entropy in the more mobile accessory genomes constitutes a signature of selection or of selectively neutral processes.
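The relative entropy measure described in the Results can be illustrated with a minimal sketch. The abstract only states that observed trinucleotide frequencies are compared with those expected from mononucleotide frequencies, so the details here are assumptions: the Kullback-Leibler form of the divergence, the base-2 logarithm, the use of overlapping trinucleotide windows, and the hypothetical function name `relative_entropy`.

```python
from collections import Counter
from math import log2

VALID_BASES = set("ACGT")


def relative_entropy(seq: str) -> float:
    """Kullback-Leibler divergence (in bits) between observed trinucleotide
    frequencies and those expected from mononucleotide frequencies.

    Assumptions (not taken from the source): overlapping windows, log base 2,
    and windows containing non-ACGT symbols are skipped.
    """
    seq = seq.upper()

    # Mononucleotide frequencies over unambiguous bases only.
    bases = [b for b in seq if b in VALID_BASES]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T")
    mono = {b: c / len(bases) for b, c in Counter(bases).items()}

    # Observed counts of overlapping trinucleotides.
    tri_counts = Counter(
        seq[i:i + 3]
        for i in range(len(seq) - 2)
        if set(seq[i:i + 3]) <= VALID_BASES
    )
    total = sum(tri_counts.values())
    if total == 0:
        return 0.0

    divergence = 0.0
    for tri, count in tri_counts.items():
        observed = count / total
        # Expected frequency if the three positions were independent.
        expected = mono[tri[0]] * mono[tri[1]] * mono[tri[2]]
        divergence += observed * log2(observed / expected)
    return divergence


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the study.
    print(relative_entropy("ACGT" * 50))   # highly ordered sequence
    print(relative_entropy("AAAAAAAA"))    # no information beyond base counts
```

Under these assumptions, a value of zero means the trinucleotide frequencies are fully explained by the base composition, and larger values indicate more structure beyond it.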

Highlights

  • The core genome consists of genes shared by the vast majority of a species and is assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, known as the accessory genome

  • A strong positive correlation between fitness and GC content was found in Escherichia coli over-expressing synthetic versions of a GFP gene with varying GC content, suggesting that increased GC content in bacteria may be associated with increased selective pressures [24]

  • The regression model with %GC as the response and intragenic region as the explanatory variable indicated that GC content was, on average, significantly higher in the core part of the genome (p < 0.001) than in the whole (p < 0.001) and accessory (p < 0.001) genomes (see Figs. 2 and 3 as well as Additional file 2)


Summary

Introduction

The core genome consists of genes shared by the vast majority of a species and is assumed to have been subjected to substantially stronger purifying selection than the more mobile elements of the genome, known as the accessory genome. Genomic nucleotide content varies greatly in bacteria, with GC content (the number of guanine and cytosine sites on one strand divided by the DNA sequence length) ranging from less than 13% to more than 75% between individual species [1]. Variation in nucleotide composition can also be substantial within individual genomes [2]. Whether optimal growth temperature influences genomic DNA composition is debated [16,17,18,19], but there is some evidence for a role of growth temperature in shaping the GC content of individual genes [20] and ribosomal RNA [21]. GC-richness may be driven by selection for more stable DNA as stacking
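The definition of GC content given in this paragraph (guanine plus cytosine sites on one strand, divided by sequence length) can be written as a short sketch. The function name `gc_content` and the toy sequences are hypothetical and are used only for illustration.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C sites in a single-strand DNA sequence."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    # Count guanine and cytosine sites, then divide by the full length.
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Toy sequences for illustration only, not data from the study.
    for name, seq in [("at_rich", "ATATTAAT"), ("balanced", "GGCCAATT")]:
        print(f"{name}: {100 * gc_content(seq):.1f}% GC")
```

Note that this sketch divides by the full sequence length, so ambiguous symbols such as N lower the reported value; whether to exclude them is a design choice the source does not specify.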

