Abstract

Single molecule sequencing (SMS) platforms enable base sequences to be read directly from individual strands of DNA in real time. Though capable of long read lengths, SMS platforms currently suffer from low throughput compared to competing short-read sequencing technologies. Here, we present a novel strategy for sequencing library preparation, dubbed ConcatSeq, which increases the throughput of SMS platforms by generating long concatenated templates from pools of short DNA molecules. We demonstrate adaptation of this technique to two target enrichment workflows commonly used for oncology applications, and show its feasibility using PacBio single molecule real-time (SMRT) technology. Our approach is capable of increasing the sequencing throughput of the PacBio RSII platform by more than five-fold, while maintaining the ability to correctly call allele frequencies of known single nucleotide variants. ConcatSeq provides a versatile new sample preparation tool for long-read sequencing technologies.

Highlights

  • The cost for sequencing DNA has decreased dramatically over the course of the last ten years at a rate outpacing Moore’s law

  • PacBio’s single molecule real-time (SMRT) technology distinguishes itself from other sequencing platforms in three main aspects: (1) during library preparation, closed circular DNA molecules, termed SMRTbells, are created by ligating hairpin adapters to both ends of double-stranded DNA target molecules, (2) these SMRTbells are bound to a sequencing primer and a DNA polymerase, and subsequently loaded as a complex into tiny sequencing units called zero-mode waveguides (ZMWs), and (3) the small volumes of the ZMWs allow real-time optical detection of fluorescently labeled phospholinked nucleotides as they are incorporated by the DNA polymerase while a copy of the template is synthesized[18]

  • The low throughput of PacBio and other single molecule sequencing (SMS) platforms presents a challenge for sequencing applications in which a large number of short DNA molecules, such as circulating tumor DNA[20] or DNA extracted from formalin-fixed paraffin-embedded tissues[21], need to be sequenced


Summary

Introduction

The cost for sequencing DNA has decreased dramatically over the course of the last ten years at a rate outpacing Moore’s law. While we are fast approaching an era in which sequencing an entire human genome costs less than $1,000[1], it is still not feasible to decipher large numbers of complex genomes due to reagent costs, informatics infrastructure, time for sample preparation, and sequencing. To this end, multiple ‘target enrichment’ methods have been developed in recent years to selectively enrich for parts of the genome that contain relevant information of interest[2]. We present a novel method for SMS library preparation that concatenates pools of short DNA fragments (ranging between ~80 and 800 bp in size) into long concatemers using Gibson Assembly and demonstrate that our approach is capable of increasing the sequencing throughput of the PacBio RSII platform by more than five-fold. The method described here offers a novel approach to expand the versatility of SMS platforms by increasing the sequencing throughput of short fragments on these otherwise long-read sequencing technologies.
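The throughput argument can be illustrated with a minimal sketch. It is not the authors' pipeline, only a toy model that assumes a read covers a fixed number of bases and counts how many short fragments, joined end to end, fall within that span. The function name `fragments_per_read`, the example fragment lengths, and the 1,000-base read length are hypothetical values chosen for illustration; the model ignores any linker or overlap sequence between fragments and any repeated passes over a circular template.

```python
from typing import Sequence


def fragments_per_read(fragment_lengths: Sequence[int], read_length: int) -> int:
    """Count how many fragments, taken in order, fit entirely within one read.

    Toy model (an assumption, not the paper's method): fragments are joined
    end to end with no linker sequence, and a read covers `read_length` bases.
    """
    if read_length < 0:
        raise ValueError("read_length must be non-negative")
    covered = 0  # bases of the concatemer covered so far
    count = 0    # whole fragments recovered so far
    for length in fragment_lengths:
        if length <= 0:
            raise ValueError("fragment lengths must be positive")
        if covered + length > read_length:
            break  # the next fragment would not be read in full
        covered += length
        count += 1
    return count


# Hypothetical fragment lengths (bp), chosen inside the ~80-800 bp range
# mentioned in the text; they are not data from the paper.
example_fragments = [150, 200, 300, 250, 400, 180]

# Without concatenation, one template carries one fragment.
# With concatenation, a single read can span several fragments.
concatenated = fragments_per_read(example_fragments, read_length=1000)
print(f"fragments recovered per read: {concatenated}")
```

In this toy model the gain per read is simply the number of whole fragments that fit in the read, which is why pooling short molecules into long templates raises the number of fragments sequenced per run.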
