Abstract

We used targeted next-generation deep sequencing (Safe-Sequencing System, SSS) to measure ultra-rare de novo mutation frequencies in the human male germline by attaching a unique identifier code to each target DNA molecule. Segments from three different human genes (FGFR3, MECP2 and PTPN11) were studied. Regardless of the gene segment, the particular testis donor or the 73 different testis pieces used, the frequencies for any one of the six different mutation types were consistent. Averaging over the C>T/G>A and G>T/C>A mutation types, the background mutation frequency was 2.6×10⁻⁵ per base pair, while for the four other mutation types the average background frequency was lower, at 1.5×10⁻⁶ per base pair. These frequencies far exceed the well-documented human genome average frequency per base pair (~10⁻⁸), suggesting a non-biological explanation for our data. Using computational modeling and a new experimental procedure that distinguishes between pre-mutagenic lesion base mismatches and a fully mutated base pair in the original DNA molecule, we argue that most of the base-dependent variation in background frequency is due to a mixture of deamination and oxidation during the first two PCR cycles. Finally, we looked at a previously studied disease mutation in the PTPN11 gene and could easily distinguish true mutations from the SSS background. We also discuss the limits and possibilities of this and other methods to measure exceptionally rare mutation frequencies, and we present calculations for other scientists seeking to design their own such experiments.
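To make the gap between the background frequencies and the ~10⁻⁸ genome average concrete, the following is a minimal sketch of the kind of design arithmetic an experimenter might run. It is not the authors' calculation: it assumes mutant molecules are sampled independently (a Poisson approximation), and the function names and the 95% detection target are hypothetical choices made for illustration.

```python
import math


def prob_at_least_one(n_molecules: int, freq: float) -> float:
    """Probability of sampling at least one mutant molecule at one site,
    assuming independent sampling (Poisson approximation)."""
    return 1.0 - math.exp(-n_molecules * freq)


def molecules_needed(freq: float, power: float = 0.95) -> int:
    """Smallest number of molecules giving P(at least one mutant) >= power.
    The 0.95 default is an illustrative assumption, not a value from the paper."""
    return math.ceil(-math.log(1.0 - power) / freq)


def expected_background(n_molecules: int, background_freq: float) -> float:
    """Expected number of background (non-biological) mutant calls at one site."""
    return n_molecules * background_freq


if __name__ == "__main__":
    true_freq = 1e-8          # approximate genome-average frequency per base pair
    high_background = 2.6e-5  # C>T/G>A and G>T/C>A background from the abstract
    low_background = 1.5e-6   # background for the four other mutation types

    n = molecules_needed(true_freq)
    print("molecules needed:", n)
    print("expected high-background calls:", expected_background(n, high_background))
    print("expected low-background calls:", expected_background(n, low_background))
```

Under these assumptions, roughly 3×10⁸ molecules per site would be needed to expect a single true mutant at a frequency of 10⁻⁸, while the same number of molecules would yield thousands of background calls at the reported background frequencies. This is why the background has to be understood before such rare events can be measured.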

Highlights

  • De novo mutations in somatic cells and the germline have a major impact on human health

  • The primers contain a target-specific sequence, a string of randomized nucleotides called the Unique IDentifier (UID), and a short universal sequence required for Illumina sequencing

  • If only a small fraction of the final sequencing reads with the same UID has a mutation, it is most likely due to DNA isolation, library preparation or some other NGS-related error, while a high proportion of reads with the same mutation likely means the mutation was present in the original genomic DNA molecule [1]
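The UID rule described in the highlights can be sketched in a few lines. This is an illustrative sketch only, not the published SSS pipeline: the input format (one `(uid, base)` pair per read at a single position), the function names, and both thresholds (`min_family_size`, `min_fraction`) are assumptions chosen for the example.

```python
from collections import Counter, defaultdict


def classify_family(bases, reference_base, min_fraction=0.95):
    """Return the mutant base if a single non-reference base makes up at least
    min_fraction of the reads in one UID family; otherwise return None.
    The 0.95 threshold is an illustrative assumption."""
    non_ref = Counter(b for b in bases if b != reference_base)
    if not non_ref:
        return None
    base, count = non_ref.most_common(1)[0]
    return base if count / len(bases) >= min_fraction else None


def mutant_family_frequency(reads, reference_base,
                            min_family_size=3, min_fraction=0.95):
    """Group reads by UID and return the fraction of usable UID families
    that are classified as carrying a mutation at this position."""
    families = defaultdict(list)
    for uid, base in reads:
        families[uid].append(base)

    # Families with too few reads cannot support a reliable call, so skip them.
    usable = [b for b in families.values() if len(b) >= min_family_size]
    if not usable:
        return 0.0

    mutants = sum(
        classify_family(b, reference_base, min_fraction) is not None
        for b in usable
    )
    return mutants / len(usable)


if __name__ == "__main__":
    reads = (
        [("AAA", "C")] * 10                      # all reads match the reference
        + [("BBB", "T")] * 10                    # every read carries the same mutation
        + [("CCC", "C")] * 9 + [("CCC", "T")]    # one discordant read: likely an error
        + [("DDD", "C")]                         # family too small to use
    )
    print(mutant_family_frequency(reads, reference_base="C"))
```

In this toy input only family `BBB` counts as mutant: the single discordant read in `CCC` falls well below the threshold and is treated as a technical error, and `DDD` is dropped for having too few reads.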

Introduction

De novo mutations in somatic cells and the germline have a major impact on human health. Estimating these rare mutation frequencies using high-throughput DNA sequencing (NGS) is limited by the high technical error rates (arising from non-biological processes) of 10⁻² to 10⁻³ per nucleotide sequenced. Vogelstein and colleagues developed a strategy, called the Safe-Sequencing System (SSS), with the potential to overcome this high error rate [1].

[...] necessarily represent the official views of the National Cancer Institute or the National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
