Abstract

Heterozygous mutations within homozygous sequences descended from a recent common ancestor offer a way to ascertain de novo mutations across multiple generations. Using exome sequences from 3222 British-Pakistani individuals with high parental relatedness, we estimate a mutation rate of 1.45 ± 0.05 × 10⁻⁸ per base pair per generation in autosomal coding sequence, with a corresponding non-crossover gene conversion rate of 8.75 ± 0.05 × 10⁻⁶ per base pair per generation. This is at the lower end of exome mutation rates previously estimated in parent–offspring trios, suggesting that post-zygotic mutations contribute little to the human germ-line mutation rate. We find frequent recurrence of mutations at polymorphic CpG sites, and an increase in C to T mutations in a 5ʹ CCG 3ʹ to 5ʹ CTG 3ʹ context in the Pakistani population compared to Europeans, suggesting that mutational processes have evolved rapidly between human populations.

Highlights

  • Using exome sequences from 3222 British-Pakistani individuals with high parental relatedness, we estimate a mutation rate of 1.45 ± 0.05 × 10⁻⁸ per base pair per generation in autosomal coding sequence, with a corresponding non-crossover gene conversion rate of 8.75 ± 0.05 × 10⁻⁶ per base pair per generation

  • Possible shortcomings of previous estimates include: (a) small sample sizes, both in the number of individuals from whom the estimate is obtained and in the number of true de novo mutations (DNMs) detected; (b) inaccurate characterization of the false negative (FN) or false positive (FP) rates, perhaps because of comparisons of sequencing data with different properties from different individuals; (c) consideration only of mutations occurring in a single generation, leading to incomplete ascertainment of post-zygotic mutations in parents or offspring[6]; (d) incomplete allowance for the correlation with paternal age; (e) the inclusion of diseased individuals who might have a higher rate of DNMs; or (f) failure to account for gene conversion events

  • We restricted our analysis to autosomal single-nucleotide substitutions with the same genotype call from both samtools[10] and Genome Analysis Toolkit (GATK)[11] when calling across all samples
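The concordance restriction described in the last highlight can be illustrated with a minimal sketch. It does not call samtools or GATK; it assumes each caller's output has already been parsed into a dictionary keyed by a hypothetical `(chrom, pos, ref, alt)` tuple with a genotype string as the value, and the function names are illustrative only.

```python
# Autosomes are assumed to be labelled "1" .. "22" (no "chr" prefix).
AUTOSOMES = {str(i) for i in range(1, 23)}


def normalize_genotype(gt):
    """Return an order-independent genotype, so '0/1', '1/0' and '0|1' compare equal."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def concordant_autosomal_snvs(calls_a, calls_b):
    """Keep autosomal single-nucleotide substitutions with the same genotype in both call sets.

    calls_a, calls_b: dict mapping (chrom, pos, ref, alt) -> genotype string.
    """
    kept = {}
    for site, gt in calls_a.items():
        chrom, _pos, ref, alt = site
        if chrom not in AUTOSOMES:
            continue  # drop sex chromosomes and other contigs
        if len(ref) != 1 or len(alt) != 1:
            continue  # drop indels and multi-base substitutions
        other = calls_b.get(site)
        if other is not None and normalize_genotype(other) == normalize_genotype(gt):
            kept[site] = normalize_genotype(gt)
    return kept
```

Requiring agreement between two independent callers trades some sensitivity for a lower false positive rate, which matters when the signal is a small number of heterozygous sites.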

Summary

Introduction

Using exome sequences from 3222 British-Pakistani individuals with high parental relatedness, we estimate a mutation rate of 1.45 ± 0.05 × 10⁻⁸ per base pair per generation in autosomal coding sequence, with a corresponding non-crossover gene conversion rate of 8.75 ± 0.05 × 10⁻⁶ per base pair per generation. Possible shortcomings of previous estimates include: (a) small sample sizes, both in the number of individuals from whom the estimate is obtained and in the number of true de novo mutations (DNMs) detected; (b) inaccurate characterization of the false negative (FN) or false positive (FP) rates, perhaps because of comparisons of sequencing data with different properties from different individuals; (c) consideration only of mutations occurring in a single generation, leading to incomplete ascertainment of post-zygotic mutations in parents or offspring[6]; (d) incomplete allowance for the correlation with paternal age; (e) the inclusion of diseased individuals who might have a higher rate of DNMs; or (f) failure to account for gene conversion events. To address these shortcomings, and to obtain an estimate that, like population-genetic approaches, averages over multiple generations and many mutational events, we adopted an approach based on observing heterozygous genotypes within sequence intervals inherited identical-by-descent (IBD) from a recent common ancestor (autozygous segments). As ours is one of the few data sets of DNMs obtained in a non-European population, we examine differences in context-specific mutational spectra between human populations.
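The logic of the autozygosity-based approach can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes the number of meioses separating the two haplotypes of each autozygous segment is known, it ignores gene conversion, and the class, field, and parameter names are hypothetical. The idea is that each base pair in a segment has had one opportunity to mutate per meiosis on the path through the common ancestor, so the rate is the error-corrected count of heterozygous sites divided by the total number of base-pair meioses.

```python
from dataclasses import dataclass


@dataclass
class AutozygousSegment:
    length_bp: int   # callable base pairs in the segment (assumed known)
    meioses: int     # total meioses separating the two haplotypes via the common ancestor
    het_sites: int   # heterozygous single-nucleotide calls observed in the segment


def estimate_mutation_rate(segments, false_positive_rate=0.0, false_negative_rate=0.0):
    """Estimate mutations per base pair per generation from autozygous segments.

    The observed heterozygous count is scaled down by the false positive rate
    and up by the false negative rate before dividing by the number of
    base-pair meioses. Both rates are assumed to be estimated elsewhere.
    """
    if not 0.0 <= false_negative_rate < 1.0:
        raise ValueError("false_negative_rate must be in [0, 1)")
    if not 0.0 <= false_positive_rate <= 1.0:
        raise ValueError("false_positive_rate must be in [0, 1]")

    opportunities = sum(s.length_bp * s.meioses for s in segments)
    if opportunities == 0:
        raise ValueError("no autozygous sequence to estimate from")

    observed = sum(s.het_sites for s in segments)
    corrected = observed * (1.0 - false_positive_rate) / (1.0 - false_negative_rate)
    return corrected / opportunities
```

For example, a segment inherited through parents who are first cousins would have `meioses=6` (three on each side of the pedigree). Pooling segments across many individuals and generations is what lets this kind of estimate average over many mutational events.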
