Abstract

Spontaneous mutations are engines of evolution: they generate the variants on which downstream evolutionary processes act to produce speciation and adaptation. Single nucleotide mutations (SNMs) are the most abundant type of spontaneous mutation. Here, we perform a meta-analysis to quantify the influence of selected global genomic parameters (genome size, genomic GC content, genomic repeat fraction, number of coding genes, gene count, and strand bias in prokaryotes) and local genomic features (local GC content, repeat content, CpG content, and the number of SNMs at CpG islands) on spontaneous SNM rates across the tree of life (prokaryotes, unicellular eukaryotes, and multicellular eukaryotes), using wild-type sequence data under two different taxon classification systems. We find that spontaneous SNM rates in our data are correlated with many genomic features in prokaryotes and unicellular eukaryotes, irrespective of sample size. In contrast, only the number of coding genes is correlated with spontaneous SNM rates in multicellular eukaryotes, a signal contributed primarily by vertebrate data. Among local features, local GC content and CpG content are significantly correlated with spontaneous SNM rates in unicellular eukaryotes, whereas local repeat fraction is an important feature in prokaryotes and in certain uni- and multicellular eukaryotes. For these predictive features, non-linear models often fit spontaneous SNM rates better than the linear model. We also observe that strand asymmetry in prokaryotes plays an important role in determining spontaneous SNM rates, whereas the SNM spectrum does not.
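
The comparison between linear and non-linear fits mentioned above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the selection criterion (AIC), the choice of a logarithmic predictor as the non-linear alternative, the function names, and the toy numbers are all assumptions made for illustration, and the data are synthetic.

```python
import math

def ols_fit(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def aic(rss, n, k):
    """AIC under Gaussian errors, up to an additive constant (k = parameter count)."""
    return n * math.log(rss / n) + 2 * k

def compare_models(feature, rate):
    """Score rate ~ feature (linear) against rate ~ log(feature) (non-linear in the feature)."""
    n = len(feature)
    _, _, rss_linear = ols_fit(feature, rate)
    _, _, rss_log = ols_fit([math.log(v) for v in feature], rate)
    return {"linear": aic(rss_linear, n, 2), "log": aic(rss_log, n, 2)}

# Synthetic toy values (NOT data from the study): a hypothetical genomic
# feature and a rate that declines with the logarithm of that feature.
toy_feature = [1, 2, 4, 8, 16, 32, 64, 128]
toy_noise = [0.05, -0.04, 0.03, -0.02, 0.04, -0.05, 0.02, -0.03]
toy_rate = [5.0 - 0.8 * math.log(f) + e for f, e in zip(toy_feature, toy_noise)]

scores = compare_models(toy_feature, toy_rate)
best = min(scores, key=scores.get)  # the model with the lower AIC is preferred
print(scores, best)
```

Both candidate models here have two parameters, so the AIC comparison reduces to comparing residual sums of squares; with candidate models of differing complexity, the penalty term `2 * k` would start to matter.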
