Abstract

CRISPR/Cas9 is emerging as one of the most-used methods of genome modification in organisms ranging from bacteria to human cells. However, the efficiency of editing varies tremendously site-to-site. A recent report identified a novel motif, called the 3'GG motif, which substantially increases the efficiency of editing at all sites tested in C. elegans. Furthermore, they highlighted that previously published gRNAs with high editing efficiency also had this motif. I designed a python command-line tool, ngg2, to identify 3'GG gRNA sites from indexed FASTA files. As a proof-of-concept, I screened for these motifs in six model genomes: Saccharomyces cerevisiae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Mus musculus, and Homo sapiens. I also scanned the genomes of pig (Sus scrofa) and African elephant (Loxodonta africana) to demonstrate the utility in non-model organisms. I identified more than 60 million single match 3'GG motifs in these genomes. Greater than 61% of all protein coding genes in the reference genomes had at least one unique 3'GG gRNA site overlapping an exon. In particular, more than 96% of mouse and 93% of human protein coding genes have at least one unique, overlapping 3'GG gRNA. These identified sites can be used as a starting point in gRNA selection, and the ngg2 tool provides an important ability to identify 3'GG editing sites in any species with an available genome sequence.

Highlights

  • Genome engineering allows for the targeted deletion or modification by homology directed repair of a target locus

  • One of the most popular methods for genome manipulation is the clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR associated protein 9 (Cas9) system adapted from Streptococcus pyogenes

  • The resulting RNA is a hybrid of the pre-CRISPR RNA (crRNA) and the tracrRNA, and includes a 20 bp guide RNA sequence

Read more

Summary

INTRODUCTION

Genome engineering allows for the targeted deletion or modification by homology directed repair of a target locus. The introduction of a longer stem in part the sgRNA stem-loop structure and the flip of a single A in a polyA track of a separate sgRNA stem-loop, called the flip + extension (F + E) sgRNA design, resulted in increased Cas editing efficiency (Chen et al, 2013). Another improvement was reported that increases efficiency. There is already support for most model organisms, but leaves individuals working on less commonly studied species without a resource In this manuscript, I report a Python command-line tool, ngg, for identification of 3′GG gRNA motifs from indexed FASTA genome files. More than 83% of all protein coding genes in 7/8 species have at least one unique 3′GG gRNA overlapping it for potential editing

MATERIALS & METHODS
RESULTS
Findings
DISCUSSION
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call