Abstract

Next Generation Sequencing (NGS) has been widely implemented in biological research and has made a profound impact on patient care. One of the essential NGS applications is the identification of disease-causing sequence variants, which requires high coverage and accuracy. Here, we report a novel NGS pipeline, termed a Sequencing System of Digitalized Barcode Encrypted Single-stranded Library from Extremely Low (quality and quantity) DNA Input with Probe-based DNA Enrichment by RNA probes targeting DNA duplex (DEEPER-Seq). This method combines an ultra-sensitive single-stranded library construction with barcode-based error correction, termed DEEPER-Library, and a DNA capture approach using RNA probes that target both DNA strands, termed DEEPER-Capture. DEEPER-Seq can create NGS libraries from as little as 20 pg of DNA with PCR error-correcting capability, and captures target sequences at an average ratio of 29.2% by targeting both DNA strands simultaneously, with over 98.6% coverage. Our method tags and sequences each of the two strands of a DNA duplex independently and scores only mutations that are found at the same position in both strands. This allows us to identify mutations with allelic fractions down to 0.03% in a whole exome sequencing (WES) study, with a background error rate of one artificial error per 4.8 × 10^9 nucleotides.
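
The rule of scoring only mutations found at the same position in both strands can be illustrated with a minimal Python sketch. This is not the DEEPER-Seq implementation: the function names, the `min_fraction` threshold, and the input layout are hypothetical. It assumes reads have already been grouped by barcode into the two strand families of one duplex, are of equal length, and are expressed in reference orientation.

```python
from collections import Counter


def strand_consensus(reads, min_fraction=0.9):
    """Collapse reads from one barcoded strand into a consensus sequence.

    A position is called 'N' when no base reaches min_fraction of the reads.
    (Hypothetical helper; the threshold is an assumption, not a published value.)
    """
    consensus = []
    for column in zip(*reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def duplex_variants(reference, top_reads, bottom_reads, min_fraction=0.9):
    """Report (position, ref_base, alt_base) only where both strands agree.

    A change seen on a single strand is treated as a PCR or sequencing
    artifact and is not reported.
    """
    top = strand_consensus(top_reads, min_fraction)
    bottom = strand_consensus(bottom_reads, min_fraction)
    variants = []
    for pos, (ref, t, b) in enumerate(zip(reference, top, bottom)):
        if t == b and t != "N" and t != ref:
            variants.append((pos, ref, t))
    return variants


reference = "ACGTACGT"
# Position 4 (A>T) is present on both strands; position 1 (C>G) only on the top strand.
top_reads = ["AGGTTCGT", "AGGTTCGT", "AGGTTCGT"]
bottom_reads = ["ACGTTCGT", "ACGTTCGT", "ACGTTCGT"]
print(duplex_variants(reference, top_reads, bottom_reads))  # [(4, 'A', 'T')]
```

In this toy input the C>G change at position 1 appears in every top-strand read, as an early PCR error would, but it is absent from the bottom strand and is therefore discarded.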

Highlights

  • Next Generation Sequencing (NGS) is revolutionizing biomedical research and clinical patient care by analyzing billions of DNA base pairs in a high-throughput but relatively low-cost manner[1]

  • NGS data will inevitably contain artificial errors (~1% of the bases) that arise during polymerase chain reaction (PCR) amplification in the sample preparation and sequencing steps, and such errors have to be corrected in order to identify rare single-nucleotide variants (SNVs)

  • The second is the enrichment of both DNA strands from the same DNA molecule (DNA duplex) through complementary RNA probes (DEEPER-Capture)

Introduction

NGS is revolutionizing biomedical research and clinical patient care by analyzing billions of DNA base pairs in a high-throughput but relatively low-cost manner[1]. NGS data will inevitably contain artificial errors (~1% of the bases) that arise during PCR amplification in the sample preparation and sequencing steps, and such errors have to be corrected in order to identify rare SNVs. In addition, observing a rare mutation requires sufficient depth of coverage, which is usually not achieved. Single-stranded library construction is intrinsically much more sensitive than a standard double-stranded library construction workflow and has been developed to prepare challenging and limited DNA material for NGS analysis[11, 12]. DNA exists naturally as a double-stranded molecule, in which each strand encodes the complement of the other. Based on this information redundancy, if the sequences of the two strands can be determined individually, it is possible to correct most if not all PCR and sequencing errors by requiring a perfect match between the two complementary DNA strands. The background error rate of such methodology can be estimated as

