Abstract
Mobile element insertions (MEIs) are repetitive genomic sequences that contribute to genetic variation and can lead to genetic disorders. Targeted and whole-genome approaches using short-read sequencing have been developed to identify reference and non-reference MEIs; however, the short read length hampers detection of these elements in complex genomic regions. Here, we pair Cas9-targeted nanopore sequencing with computational methodologies to capture active MEIs in human genomes. We demonstrate parallel enrichment for distinct classes of MEIs, with an average of 44% of reads carrying on-target signals and a 13.4x to 54x enrichment over whole-genome approaches. We show that an individual flow cell can recover most MEIs (97% L1Hs, 93% AluYb, 51% AluYa, 99% SVA_F, and 65% SVA_E). We identify seventeen non-reference MEIs in GM12878 that were overlooked by modern long-read analysis pipelines, primarily in repetitive genomic regions. This work introduces the utility of nanopore sequencing for MEI enrichment and lays the foundation for rapid discovery of elusive, repetitive genetic elements.
Highlights
Our results showed that a single MinION flow cell (FAL11389 or FAO84736) was sufficient to capture most of the known reference and estimated non-reference L1Hs subfamilies in the genome when considering intermediate values, indicating very high sensitivity of guide RNA targeting in the experiments.
Our results indicate that an individual MinION flow cell (FAL11389) can capture all (100%) reference and non-reference instances of a single MEI subfamily (L1Hs), compared with sequencing on the smaller Flongle flow cells.
We identified 12 additional nanopore-specific L1Hs insertions, each with ≥4 supporting reads, that had been missed by the PacBio-MEI set. These insertions carry valid hallmarks, including target site duplication motifs, poly(A), EN cleavage site, and empty site sequences, indicating a retrotransposition event induced by the target-primed reverse transcription (TPRT) mechanism (Supplementary Data 9 and Supplementary Fig. 10).
Summary
Cas9 targeted enrichment strategy for mobile elements using nanopore sequencing. We chose GM12878 (NA12878), a member of the CEPH pedigree number 1463 (GM12878, GM12891, and GM12892)[54], as the benchmark genome in this study. A pooled run of an unbarcoded, five MEI subfamily enrichment experiment can recover the vast majority of known reference and non-reference MEIs (96.5% L1Hs, 93.3% AluYb, 51.4% AluYa, 99.6% SVA_F, and 64.5% SVA_E) in the genome when considering elements with a ≤ 3 bp mismatch to the guide RNA. Such an approach outperformed individual Flongle flow cells and approached the same capture level as the single MEI subfamily MinION run (Fig. 3, Supplementary Fig. 7, and Supplementary Data 3, 5, and 6). Using a cutoff of 15 supporting reads, we find few additional L1Hs insertions when including on-target reads beyond approximately 30,000 (Fig. 4a). This is consistent with the observation that the MinION (individual or pooled, usually with >100 k passed reads) has the ability to capture most non-reference L1Hs. In addition, there was no observable enrichment bias of MEI subfamilies from different flow cells (Supplementary Fig. 8). These observations demonstrate the high sensitivity of nanopore Cas9 enrichment, suggesting its feasibility for MEI discovery in complex genomic regions.
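The two thresholds mentioned above (a ≤ 3 bp mismatch to the guide RNA and a cutoff of 15 supporting reads) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the guide sequence, the `Candidate` record, and the function names are hypothetical placeholders, and applying both thresholds in a single filter is an assumption made only for illustration.

```python
from dataclasses import dataclass


def count_mismatches(guide: str, site: str) -> int:
    """Count position-wise mismatches (Hamming distance) between a guide
    and an equal-length candidate target site."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(1 for g, s in zip(guide.upper(), site.upper()) if g != s)


@dataclass
class Candidate:
    """Hypothetical record for one candidate insertion."""
    name: str
    site: str               # sequence at the guide target site
    supporting_reads: int   # number of reads supporting the insertion


def filter_candidates(guide, candidates, max_mismatches=3, min_reads=15):
    """Keep candidates within the mismatch limit that also meet the
    supporting-read cutoff. Defaults mirror the values quoted in the text."""
    return [
        c for c in candidates
        if count_mismatches(guide, c.site) <= max_mismatches
        and c.supporting_reads >= min_reads
    ]


# Placeholder 20-nt guide; NOT a real guide RNA from the study.
GUIDE = "ACGTACGTACGTACGTACGT"

candidates = [
    Candidate("ins_a", "ACGTACGTACGTACGTACAA", 20),  # 2 mismatches, 20 reads
    Candidate("ins_b", "TTTAACGTACGTACGTACGT", 40),  # 4 mismatches, 40 reads
    Candidate("ins_c", "ACGTACGTACGTACGTACGT", 4),   # 0 mismatches, 4 reads
]

kept = filter_candidates(GUIDE, candidates)
print([c.name for c in kept])
```

In this toy input only `ins_a` passes both thresholds; `ins_b` exceeds the mismatch limit and `ins_c` has too few supporting reads. Lowering `min_reads` to 4 would correspond to the more permissive read support quoted for the nanopore-specific insertions in the Highlights.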