Abstract

We present a comprehensive genome-wide comparison of three sequencing methods (conventional next-generation sequencing (NGS), tag-based single-strand sequencing (e.g., SSCS), and Duplex Sequencing) for investigating mitochondrial mutations in human breast epithelial cells. Duplex Sequencing produces both a single strand consensus sequence (SSCS) analysis and a duplex consensus sequence (DCS) analysis. Our study validates that although high-frequency mutations are detectable by all three sequencing methods with similar accuracy and reproducibility, rare (low-frequency) mutations are not accurately detectable by NGS or SSCS. Even with conservative bioinformatic modification to overcome the high error rate of NGS, the NGS frequency of rare mutations is 7.0 × 10⁻⁴. The frequency is reduced to 1.3 × 10⁻⁴ with SSCS and is further reduced to 1.0 × 10⁻⁵ using DCS. Rare mutation context spectra obtained from NGS vary significantly across independent experiments, and it is not possible to identify a dominant mutation context. In contrast, rare mutation context spectra are consistently similar in all independent DCS experiments. We have systematically identified heat-induced artifactual variants and corrected the artifacts using Duplex Sequencing. Specific sequence contexts were analyzed to examine the effects of neighboring bases on the accumulation of heat-induced artifactual variants. All of these artifacts are stochastically occurring rare mutations. C > A/G > T, a signature of oxidative damage, is the heat-induced artifactual mutation type that increases the most (170-fold). Our results strongly support the claim that Duplex Sequencing accurately detects low-frequency mutations and identifies and corrects artifactual mutations introduced by heating during DNA preparation.
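The difference between an SSCS and a DCS can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `sscs` and `dcs`, the 0.7 agreement threshold, and the toy reads are all assumptions made for illustration. Reads are assumed to be pre-aligned, of equal length, already grouped by molecular tag and strand, and written in reference orientation so that the two strands of one molecule can be compared base by base.

```python
from collections import Counter


def sscs(reads, min_fraction=0.7):
    """Collapse same-strand reads that share a tag into one consensus.

    A position is called only if its most common base reaches
    min_fraction of the reads (an assumed threshold); otherwise 'N'.
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def dcs(sscs_top, sscs_bottom):
    """Keep a base only where both strand consensuses agree; otherwise 'N'."""
    return "".join(a if a == b else "N" for a, b in zip(sscs_top, sscs_bottom))


# Hypothetical read family: one read carries a sequencing error (C -> A).
top_reads = ["ACGT", "ACGT", "ACGT", "AAGT"]
# Hypothetical single-strand damage: every bottom-strand read shows T at position 3.
bottom_reads = ["ACTT", "ACTT", "ACTT"]

top = sscs(top_reads)        # the lone sequencing error is voted out
bottom = sscs(bottom_reads)  # the damage survives, since all reads share it
duplex = dcs(top, bottom)    # the strand disagreement is masked
print(top, bottom, duplex)
```

In this toy case the SSCS step removes the isolated sequencing error but keeps the single-strand change, while the DCS step masks it because the complementary strand does not support it. This mirrors the abstract's claim that artifacts confined to one strand are corrected only at the duplex level.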

Highlights

  • Next-generation sequencing (NGS) has rapidly transformed entire areas of basic research and therapeutic applications by making large-scale genomic studies feasible through reduced cost and faster turnaround time [1,2]

  • The average number of nucleotides sequenced at each genome position in the conventional NGS, single strand consensus sequence (SSCS), and duplex consensus sequence (DCS) analyses was calculated as the total number of nucleotides sequenced divided by the mtDNA size of 16,569 bases

  • Duplex Sequencing Identifies and Corrects the Heat-Induced Artifactual Variants Introduced During DNA Sample Preparation. We investigated which specific types of artifactual variants are introduced during DNA sample preparation steps such as heat treatment, and to what extent these artifacts can be corrected by Duplex Sequencing
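The average-depth calculation described in the highlights above amounts to a single division. The sketch below shows it; the function name `mean_depth` and the example nucleotide total are hypothetical, and only the mtDNA size of 16,569 bases comes from the text.

```python
MTDNA_LENGTH = 16_569  # human mtDNA size in bases, as stated in the text


def mean_depth(total_nucleotides_sequenced, genome_length=MTDNA_LENGTH):
    """Average number of nucleotides sequenced per genome position."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return total_nucleotides_sequenced / genome_length


# Hypothetical total chosen for illustration only (not a value from the study).
example_total = 16_569 * 1_000
print(mean_depth(example_total))  # 1000.0
```

The same function would be applied separately to the NGS, SSCS, and DCS nucleotide totals to give one average depth per analysis.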

Introduction

Next-generation sequencing (NGS) has rapidly transformed entire areas of basic research and therapeutic applications by making large-scale genomic studies feasible through reduced cost and faster turnaround time [1,2]. A major impediment to investigating subclonal (low-frequency) mutations is that conventional NGS methods have high error rates (10⁻² to 10⁻³), which obscure true mutations that occur less frequently than errors [3,4]. Duplex Sequencing examines both strands of DNA and scores mutations only if they are present on both strands of the same DNA molecule as complementary substitutions. This significantly reduces sequencing error rates to

