Abstract

A detailed understanding of the genome-wide variability of single-nucleotide germline mutation rates is essential to studying human genome evolution. Here, we use ~36 million singleton variants from 3560 whole-genome sequences to infer fine-scale patterns of mutation rate heterogeneity. Mutability is jointly affected by adjacent nucleotide context and diverse genomic features of the surrounding region, including histone modifications, replication timing, and recombination rate, sometimes suggesting specific mutagenic mechanisms. Remarkably, GC content, DNase hypersensitivity, CpG islands, and H3K36 trimethylation are associated with both increased and decreased mutation rates depending on nucleotide context. We validate these estimated effects in an independent dataset of ~46,000 de novo mutations, and confirm our estimates are more accurate than previously published results based on ancestrally older variants without considering genomic features. Our results thus provide the most refined portrait to date of the factors contributing to genome-wide variability of the human germline mutation rate.

Highlights

  • Evaluating these differences in an independent dataset of ~46,000 de novo mutations, collected from two published family-based whole-genome sequencing (WGS) studies[9,12], we find that estimates derived from extremely rare variants (ERVs) yield a significantly more accurate portrait of present-day germline mutation rate heterogeneity

  • We identified and removed 156 samples that appeared to be technical outliers, resulting in a final call set of 35,574,417 autosomal ERVs from 3560 individuals (Methods)

Summary

Introduction

A detailed understanding of the genome-wide variability of single-nucleotide germline mutation rates is essential to studying human genome evolution. GC content, DNase hypersensitivity, CpG islands, and H3K36 trimethylation are associated with both increased and decreased mutation rates depending on nucleotide context. We validate these estimated effects in an independent dataset of ~46,000 de novo mutations, and confirm our estimates are more accurate than previously published results based on ancestrally older variants without considering genomic features. The gold standard for studying the germline mutation rate in humans is direct observation of de novo mutations from family-based whole-genome sequencing (WGS) data[9,10,11,12]. These studies have produced accurate estimates of the genome-wide average mutation rate (~1–1.5 × 10⁻⁸ mutations per base pair per generation) and uncovered some of the mutagenic effects of genomic features. ERVs represent a relatively unbiased sample of recent mutations and are far more numerous than de novo mutations collected in family-based WGS studies.
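
As a rough illustration of how ERV counts might be turned into context-dependent relative mutation rates, the sketch below tallies singletons by trinucleotide context and alternate allele, then divides by the number of times that context occurs in the reference. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function names, the toy reference sequence, and the singleton list are all hypothetical, and strand collapsing and genomic features are ignored.

```python
from collections import Counter


def count_contexts(reference, k=3):
    """Count every k-mer in the reference (the mutation opportunities)."""
    return Counter(reference[i:i + k] for i in range(len(reference) - k + 1))


def relative_rates(reference, singletons, k=3):
    """Return singletons per occurrence of each (context, alt) pair.

    singletons: iterable of (position, alt) tuples with 0-based positions.
    """
    flank = k // 2
    opportunities = count_contexts(reference, k)
    observed = Counter()
    for pos, alt in singletons:
        if pos < flank or pos + flank >= len(reference):
            continue  # skip sites that lack a full flanking context
        context = reference[pos - flank:pos + flank + 1]
        observed[(context, alt)] += 1
    # Divide each observed count by how often its context appears.
    return {key: n / opportunities[key[0]] for key, n in observed.items()}


# Toy data, purely hypothetical.
reference = "ACGTACGTACGA"
singletons = [(1, "T"), (5, "T"), (2, "A")]
rates = relative_rates(reference, singletons)
for (context, alt), rate in sorted(rates.items()):
    print(context, alt, round(rate, 3))
```

In this toy case the context ACG occurs three times and carries two C>T singletons, so its relative rate is 2/3. A real analysis would also need to account for the sample size and for the regional features discussed above.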
