Abstract

Genome-wide analyses have reported an excess of coincident single nucleotide polymorphisms (coSNPs) at human-chimpanzee orthologous positions and suggested that it is due to cryptic variation in the mutation rate. While this observation primarily concerns non-coding coSNPs, the situation in coding sequences remains unclear. Here we calculate the observed-to-expected ratio of coSNPs (coSNPO/E) to estimate the prevalence of human-chimpanzee coSNPs, and show that the excess of coSNPs is also present in coding regions. Intriguingly, coSNPO/E is much higher at zero-fold than at nonzero-fold degenerate sites; this difference is due to an elevation of coSNPO/E at zero-fold degenerate sites rather than a reduction at nonzero-fold degenerate ones. These trends are independent of chimpanzee subpopulation, population size, and sequencing technique, and hold broadly across primates. We find that this discrepancy cannot be fully explained by sequence context, shared ancestral polymorphisms, SNP density, or recombination rate, and that coSNPO/E in coding sequences is significantly influenced by purifying selection. We also show that selection and mutation rate affect coSNPO/E independently, and that coSNPs tend to be less damaging and more correlated with human diseases than non-coSNPs. These results suggest that coSNPs may represent a “signature” during primate protein evolution.
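As an illustration of the observed-to-expected ratio, the following is a minimal sketch, not the authors' pipeline. It assumes the expected number of coSNPs is the count obtained if human and chimpanzee SNPs fell independently on the same set of orthologous sites, with no correction for sequence context or CpG effects. The function name `cosnp_obs_exp` and all counts are hypothetical.

```python
def cosnp_obs_exp(n_sites, n_human_snps, n_chimp_snps, n_cosnps):
    """Observed-to-expected ratio of coincident SNPs.

    Assumption (not taken from the source): the expected count is the one
    obtained if SNPs in the two species occur independently across the same
    n_sites orthologous positions, i.e.
    expected = n_human_snps * n_chimp_snps / n_sites.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    expected = n_human_snps * n_chimp_snps / n_sites
    if expected == 0:
        raise ValueError("expected coSNP count is zero; ratio is undefined")
    return n_cosnps / expected


# Hypothetical counts, for illustration only (not data from the study).
ratio = cosnp_obs_exp(
    n_sites=1_000_000,    # orthologous sites examined
    n_human_snps=10_000,  # sites polymorphic in human
    n_chimp_snps=5_000,   # sites polymorphic in chimpanzee
    n_cosnps=100,         # sites polymorphic in both species
)
print(ratio)
```

Under this assumption, a ratio above 1 indicates more coSNPs than independence would predict. Computing the ratio separately for zero-fold and nonzero-fold degenerate sites would give the kind of comparison described above.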

Highlights

  • Genome-wide analysis has observed an excess of coincident single nucleotide polymorphisms at human-chimpanzee orthologous positions, and suggested that this is due to cryptic variation in the mutation rate

  • After rejecting possible explanations including sequence contexts, shared ancestral polymorphisms, density of single nucleotide polymorphisms (SNPs), and recombination rate, we showed that: (i) the strength of selective constraints was positively correlated with coSNPO/E at zero-fold degenerate sites; (ii) the discrepancy in coSNPO/E between zero-fold and nonzero-fold degenerate sites increased with the strength of selective constraints; and (iii) selection and mutation rate affected coSNPO/E independently

  • We found that 86% (9,615) of the identified coding SNPs were novel to the chimpanzee dbSNP (Build 136), and that 29% (3,249) of them were previously uncharacterized in the published chimpanzee SNP datasets (CdbSNP, CE12, CW5, CW10, and CW25 SNPs)


Summary

Introduction

Previous studies have shown that there is an excess of coincident SNPs (coSNPs) between human and chimpanzee, that is, human-chimpanzee orthologous sites observed to have a SNP in both species[2,3]. This observation cannot be fully explained by the CpG effect, GC content, simple contextual effects (such as effects of neighboring nucleotides), shared ancestral polymorphisms, natural selection, or technical artifacts, leaving cryptic variation in the mutation rate as the most likely explanation for this bias[2,3,4,5]. Since population size is highly associated with the evolutionary dynamics of weakly-selected mutations[7], we controlled for this

www.nature.com/scientificreports/

