Abstract

Programmable DNA binding proteins (PDPs) that selectively bind user-defined nucleotide sequences can direct protein domains to any desired target within the genome. Different functional domains can be fused to PDPs to achieve (1) activation or repression of transcription, (2) nicking or cleavage of DNA, or (3) modification or removal of epigenetic marks at the targeted genomic site. Given their broad applicability, the development of highly specific, easy-to-use PDPs has been a holy grail of biotechnology since the determination of the 3-D structure of zinc fingers (ZFs) bound to DNA in 1991 (Pavletich and Pabo, 1991). This structure suggested that tandemly arranged ZFs act as functionally independent modules, with each finger binding three bases. However, the combination of fingers with experimentally defined base specificities into arrays revealed that ZFs are not functionally independent. Instead, they were found to behave in a context-dependent and often unpredictable fashion, requiring iterative cycles of selection to obtain ZF arrays with the desired specificity (Joung and Sander, 2013). By contrast, the more recently identified transcription activator-like effector proteins (TALEs) from the bacterial genus Xanthomonas bind DNA with functionally independent, highly similar 33–35 amino-acid repeat modules (Mak et al., 2013).
TALE repeats collectively form a superhelix around the DNA, tracking along the sense strand, with residue 13 of each repeat making contact with a single DNA base. Residues in position 13 that bind preferentially to adenine, cytosine, guanine, or thymine have been identified, and this TALE code facilitates the assembly of repeat arrays with any desired specificity (Joung and Sander, 2013). A major hurdle to the routine application of the TALE DNA binding domain has been that genes encoding tandemly arranged repeats are difficult to assemble via standard cloning approaches. However, recently established hierarchical, multi-fragment ligation protocols facilitate rapid assembly of genes encoding TALE repeat arrays with the desired DNA specificity (Joung and Sander, 2013).
Most recently, the Cas9 protein from Streptococcus pyogenes has emerged as a new and promising PDP (Jinek et al., 2012). Cas9 is part of an adaptive bacterial immune system that mediates cleavage of foreign viral DNA and plasmids. Short fragments of such foreign sequences, termed protospacers, integrate into the clustered regularly interspaced short palindromic repeat (CRISPR) locus of the bacterial genome. Transcribed CRISPR RNAs (crRNAs) bind to auxiliary trans-activating crRNAs (tracrRNAs), and these RNA hybrids direct the sequence specificity of the CRISPR-associated (Cas) 9 endonuclease.
In Cas9-based designer nucleases, the specificity is mediated by a single guide RNA (gRNA) that mimics the natural crRNA–tracrRNA hybrid. In such RNA-guided endonucleases (RGENs), reprogramming of DNA specificity requires no changes in the Cas9 protein, but only in the recombinant gRNA. By contrast, reprogramming of TALE- or ZF-based PDP architectures requires the modification of multiple DNA-interacting residues that are spread across these DNA binding scaffolds. The simplicity with which RGENs can be programmed explains the significant interest in this approach from the life-science community. Notably, a catalytically dead Cas9 lacking endonuclease activity retains its gRNA-mediated sequence specificity (Qi et al., 2013). Thus, cleavage-deficient Cas9 derivatives should have the same broad potential for application as ZF- and TALE-based PDPs and may be used to interfere with transcription or to modulate the epigenetic status of DNA.
PDPs can be used in many different functional applications but have been used predominantly in the context of designer nucleases as tools for precision genome engineering. Designer nucleases create double-strand breaks (DSBs) at predefined DNA sequences, and the cell’s innate error-prone non-homologous end joining (NHEJ) repair system seals the break, typically producing mutations at the cleavage site. Alternatively, but less frequently, homology-based repair (HR) occurs, which can be exploited to recombine desired sequences into a given cleavage site. How can we use different PDPs for genome editing purposes?
As mentioned, the native Cas9 is an endonuclease, and thus expression of a custom guide RNA matching a given DNA target converts Cas9 into a designer nuclease with the desired specificity (Figure 1). By contrast, native ZFs and TALEs are transcriptional activators that typically gain cleavage activity through translational fusion to a FokI nuclease domain (ZF nuclease, ZFN; TALE nuclease, TALEN). The FokI nuclease domain is devoid of DNA binding activity, and thus the TALE or ZF domain confers the target specificity of the fusion protein. An important feature of the FokI domain is that it cleaves DNA only as a dimer. FokI dimerization, and thus cleavage, is accomplished by targeting two distinct TALE– or ZF–FokI fusion proteins to DNA sequences flanking the desired cleavage site (Figure 1). The benefit of this architecture is that cleavage occurs only if both fused PDPs bind to the desired genome location, thus resulting in additive specificity from two DNA binding domains. Fusions to alternative cleavage domains have recently allowed the creation of monomeric TALE-derived nickases and nucleases (Beurdeley et al., 2013; Gabsalilow et al., 2013) that are simpler in terms of generation and transfection but potentially less specific than TALENs, which functionally rely on the interaction of two DNA binding domains.
Expression of distinct gRNAs alongside Cas9 results in Cas9–gRNA complexes with distinct cleavage specificities (Cong et al., 2013; Jiang et al., 2013; Mali et al., 2013), thus facilitating multiplex genome editing. By contrast, targeting distinct loci with TALENs or ZFNs requires the design and synthesis of a new custom TALE or ZF array for each locus, which is more costly and time-consuming than RGEN-based multiplex genome editing. However, a TALEN library covering predicted human genes has already been established (Kim et al., 2013). Similar libraries may be available soon for other model species, thus possibly limiting the effort that is required to target multiple loci with TALENs. Also, custom TALE arrays can now be ordered from commercial suppliers, and off-the-shelf libraries may soon be part of commercial programs, most likely reducing costs per TALEN and facilitating rapid delivery of desired TALE arrays.
Ideally, a given PDP can be adapted to targets with any desired nucleotide composition. However, RGENs, ZFs, and TALEs each have different limitations in their targeting capabilities. For example, the number of programmable bases in Cas9 is restricted to 20 and somewhat constrained by three invariable guanine bases that must flank the target site (GN19NGG; details are given in Figure 1).
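The GN19NGG constraint can be illustrated with a short sequence scan. The following Python sketch is not taken from the article; it reads the pattern literally as a G followed by 19 freely programmable bases and then an NGG flank, and it assumes the flank lies immediately 3′ of the 20-base target on the strand being read (the exact geometry is given in Figure 1, which is not reproduced here). The function names and the example sequence are hypothetical.

```python
import re

# GN19NGG read literally: G + 19 programmable bases, then N + GG.
# Assumption: the NGG flank sits directly 3' of the 20-base target.
# The lookahead makes the scan report overlapping candidate sites too.
SITE = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")

def find_cas9_sites(seq):
    """Return (start, target_20nt, flank_3nt) for each GN19NGG match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)[:20], m.group(1)[20:])
            for m in SITE.finditer(seq)]

def reverse_complement(seq):
    """Reverse complement, so the opposite strand can be scanned as well."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Hypothetical example sequence containing one candidate site.
example = "AAGACGTACGTACGTACGTACGTGGAA"
for start, target, flank in find_cas9_sites(example):
    print(start, target, flank)
```

Scanning `reverse_complement(example)` in the same way covers the opposite strand. A scan of this kind only lists candidate sites that satisfy the sequence constraint; it says nothing about cleavage efficiency or off-target activity.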
ZF arrays typically consist of three or four fingers, and for some base triplets corresponding fingers have not yet been identified. Furthermore, ZFs targeting GNN triplets seem to be superior to other ZFs, somewhat limiting the targeting capabilities of ZF arrays. Custom TALE DNA binding domains typically consist of 16–24 repeats, and thus this type of PDP should be superior in terms of target specificity. The N-terminus of Xanthomonas TALEs, however, has a strong preference for a 5′ thymine in its target sequence, which somewhat restricts target flexibility. Yet, recent studies on TALEs from Ralstonia solanacearum uncovered an N-terminal domain with distinct specificity (de Lange et al., 2013), thus further improving flexibility in TALE-based PDPs.
A given PDP should bind specifically at one unique site within the genome of a given target cell. The demands on DNA binding specificity are dictated by the genome size of the given target cell, but also by the type of application. For example, for therapeutic applications in humans (e.g. a designer nuclease cleaving viral DNA), a PDP should ideally show no off-target activity in any of the estimated 1 × 10¹³ cells of the human body, thus significantly raising the specificity demands for a PDP. Comparative studies on the specificity of ZFNs and TALENs suggest that the latter are superior in specificity (Mussolino et al., 2011), and the scientific community impatiently awaits off-target studies for RGENs. However, if a PDP can be generated at low cost and high speed, this will in some cases be more important than its gene targeting accuracy. For example, off-site activity may be tolerable for fundamental research, where Mendelian genetics can be used to clarify the causal relationships between phenotype and genotype and to segregate intended target modifications away from deleterious off-site modifications.
In clinical and translational research, gene delivery remains a major bottleneck for PDPs, and large gene constructs typically cause problems. Indeed, recent studies revealed that genes encoding TALENs are unstable in lentiviral vectors (Holkers et al., 2013), which is likely due to the repetitive nature and size of genes encoding TALE repeat arrays. A typical 17.5-repeat TALEN is about 105 kDa and is thus still substantially smaller than Cas9, which is about 160 kDa. By contrast, a typical four-finger ZFN is only about 40 kDa. Thus, it seems likely that the compact nature of ZFNs may be beneficial in certain experimental contexts.
It is likely that the unique features of distinct PDPs will lead to differential use in distinct downstream applications. We anticipate that, in fundamental plant science, where Mendelian genetics is routinely used to correlate genotype and phenotype, Cas9-derived PDPs will have a major impact in the near future due to the simplicity of construct generation. By contrast, in commercial crop-breeding programs, off-targeting will remain a major issue and consumers are likely to request ‘safe food’.
In this context, TALE-derived PDPs, due to their previously demonstrated high specificity (Mussolino et al., 2011), may be the preferred choice. In this way, both fundamental and applied research stand to benefit greatly from the rapid progress in the development of PDPs.
