Abstract

Using the 2.6 million single nucleotide polymorphism (SNP) genotype datasets from Perlegen Sciences and the Haplotype Map (HapMap) project (Phase I freeze), a probabilistic search for the landscape exhibited by positive Darwinian selection was conducted (Wang et al., 2006). By sorting each high-frequency allele by homozygosity, we searched for the expected decay of linkage disequilibrium (LD) with adjacent SNPs at recently selected alleles, which eliminates the need to infer haplotypes. We designate this approach the LD decay (LDD) test. Cluster analysis indicates that approximately 3000 sites of inferred recent selection are present in human DNA, representing approximately 1800 genes. Prior simulation studies (Wang et al., 2006) indicate that this novel LDD test, at the Mb scale employed, effectively distinguishes selection from other causes of extensive LD, such as inversions, population bottlenecks and admixture. Based on over-representation analysis, these prior studies showed that several predominant biological themes are common among alleles inferred to be under selection, including genes involved in DNA metabolism and repair. Here, we show that three of these DNA repair genes, ERCC8, Fanconi Anemia Complementation Group C (FANCC), and RAD51C, exhibit genomic architectures consistent with ongoing balanced selection over the last 40,000–50,000 years.
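The core idea behind the LDD test, using unphased genotypes of individuals homozygous for an allele and examining how homozygosity at neighbouring SNPs falls off with distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `homozygosity_decay`, the two-letter genotype encoding, and the toy data are all assumptions made for illustration, and the published test adds probabilistic scoring and cluster analysis that are not shown here.

```python
from typing import List, Tuple

# Assumed encoding: an unphased genotype is a two-letter string, e.g. "AA" or "AG".
Genotype = str


def homozygosity_decay(
    genotypes: List[List[Genotype]],
    positions: List[int],
    core_index: int,
    core_allele: str,
) -> List[Tuple[int, float]]:
    """Return (distance, fraction homozygous) pairs around a core SNP.

    Only individuals homozygous for `core_allele` at the core SNP are used,
    so no haplotype phasing is required.
    """
    # Keep individuals homozygous for the chosen allele at the core SNP.
    carriers = [g for g in genotypes if g[core_index] == core_allele * 2]
    if not carriers:
        return []

    decay = []
    for j, pos in enumerate(positions):
        if j == core_index:
            continue
        # Count carriers that are also homozygous at the neighbouring SNP.
        homozygous = sum(1 for g in carriers if g[j][0] == g[j][1])
        distance = abs(pos - positions[core_index])
        decay.append((distance, homozygous / len(carriers)))

    # Sort by distance from the core SNP so the decay can be read in order.
    return sorted(decay)


# Hypothetical toy data: 4 individuals (rows) typed at 4 SNPs (columns).
toy_genotypes = [
    ["AA", "CC", "GG", "TT"],
    ["AA", "CC", "GG", "TC"],
    ["AA", "CC", "GA", "TC"],
    ["AG", "CT", "GA", "TT"],
]
toy_positions = [1000, 2000, 5000, 9000]  # base-pair coordinates (made up)

toy_decay = homozygosity_decay(toy_genotypes, toy_positions, 0, "A")
```

In this toy example the fraction of homozygous neighbours drops as distance from the core SNP grows. An allele whose homozygotes retain high neighbouring homozygosity over unusually long distances would be a candidate for the kind of extended LD the test looks for.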
