Abstract

Analyzing structural variations (SVs) is important for understanding the mutations underlying genetic disorders and pathogenic conditions. However, characterizing SVs with short-read, high-throughput sequencing technology is difficult. Although long-read sequencing technologies are increasingly employed to characterize SVs, their low throughput and high cost discourage widespread adoption. Sequence motif-based optical mapping in nanochannels is useful for whole-genome mapping and SV detection, but it cannot precisely locate breakpoints or estimate copy numbers. We present here a universal multicolor mapping strategy in nanochannels that combines a conventional sequence-motif labeling system with Cas9-mediated, target-specific labeling of any 20-base sequence (20mer) to create custom labels and detect new features. The sequence motifs are labeled with green fluorophores and the 20mers with red fluorophores. With this strategy, it is possible not only to detect SVs but also to use custom labels to interrogate features that are not accessible to motif labeling, locate breakpoints, and precisely estimate the copy numbers of genomic repeats. We validated our approach by quantifying D4Z4 copy number, a known biomarker for facioscapulohumeral muscular dystrophy (FSHD), and by estimating telomere length, a clinical biomarker for assessing disease risk in aging-related diseases and malignant cancers. We also demonstrate the application of our methodology to discovering insertions of the transposable long interspersed nuclear element 1 (LINE-1) across the whole genome.
