Abstract

A substantial fraction of the human genome is difficult to interrogate with short-read DNA sequencing technologies due to paralogy, complex haplotype structures, or tandem repeats. Long-read sequencing technologies, such as Oxford Nanopore's MinION, enable direct measurement of complex loci without introducing many of the biases inherent to short-read methods, though they suffer from relatively lower throughput. This limitation has motivated recent efforts to develop amplification-free strategies to target and enrich loci of interest for subsequent sequencing with long reads. Here, we present CaBagE, a method for target enrichment that is efficient and useful for sequencing large, structurally complex targets. The CaBagE method leverages the stable binding of Cas9 to its DNA target to protect desired fragments from digestion with exonuclease. Enriched DNA fragments are then sequenced with Oxford Nanopore's MinION long-read sequencing technology. Enrichment with CaBagE resulted in a median of 116X coverage (range 39-416) of target loci when tested on five genomic targets ranging from 4-20kb in length using healthy donor DNA. Four cancer gene targets were enriched in a single reaction and multiplexed on a single MinION flow cell. We further demonstrate the utility of CaBagE in two ALS patients with C9orf72 short tandem repeat expansions to produce genotype estimates commensurate with genotypes derived from repeat-primed PCR for each individual. With CaBagE there is a physical enrichment of on-target DNA in a given sample prior to sequencing. This feature allows adaptability across sequencing platforms and potential use as an enrichment strategy for applications beyond sequencing. CaBagE is a rapid enrichment method that can illuminate regions of the 'hidden genome' underlying human disease.

Highlights

  • While short-read DNA sequencing technologies have enabled the discovery of genetic variants underlying numerous rare genetic disorders [1, 2], a large fraction of the human genome remains very difficult to interrogate with short reads

  • We found that selecting for larger fragments after adapter ligation using the Oxford Nanopore Technologies (ONT) Long Fragment Buffer, which selects for fragments longer than 3kb, resulted in fewer reads overall and fewer on-target reads despite target fragments being larger than 3kb

  • By relying on the binding kinetics of the Cas9 enzyme to its RNA-guided target, Cas9 Background Elimination (CaBagE) can flexibly enrich for targets so long as most fragments in the input DNA are intact between Cas9 binding sites
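The coverage and on-target read figures quoted above can be made concrete with a small calculation. The following is a minimal sketch, assuming reads have already been aligned and reduced to (start, end) intervals on the target's chromosome. The function names and the toy intervals are hypothetical illustrations, not the authors' pipeline or data, and mean depth is only one common way to define "X coverage".

```python
def mean_depth(reads, target_start, target_end):
    """Mean per-base depth over the half-open target [target_start, target_end).

    `reads` is an iterable of (start, end) alignment intervals (hypothetical input format).
    """
    target_len = target_end - target_start
    if target_len <= 0:
        raise ValueError("target must have positive length")
    covered_bases = 0
    for read_start, read_end in reads:
        # Number of target bases this read overlaps (zero if it misses the target).
        overlap = min(read_end, target_end) - max(read_start, target_start)
        if overlap > 0:
            covered_bases += overlap
    return covered_bases / target_len


def spans_target(read, target_start, target_end):
    """True if a single read covers the target from end to end."""
    read_start, read_end = read
    return read_start <= target_start and read_end >= target_end


# Toy alignments for illustration only; these are NOT data from the study.
toy_reads = [(1000, 6000), (900, 6100), (2000, 4000), (7000, 9000)]
target = (1000, 6000)

depth = mean_depth(toy_reads, *target)
n_spanning = sum(spans_target(r, *target) for r in toy_reads)
print(f"mean depth: {depth:.1f}X, target-spanning reads: {n_spanning}")
```

Counting target-spanning reads separately is useful for large, structurally complex targets, where only reads that cross the whole locus resolve it in a single molecule.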

Introduction

While short-read DNA sequencing technologies have enabled the discovery of genetic variants underlying numerous rare genetic disorders [1, 2], a large fraction of the human genome remains very difficult to interrogate with short reads. These so-called “hidden” regions are difficult to sequence with short-read technologies owing to a mixture of sequence paralogy, complex haplotype structures, and tandem repeats [3, 4]. Polymorphic mobile element insertions are difficult to map because multiple copies exist throughout the genome, yet broad phenotypic effects of this variation have been suggested [6, 7].
