Abstract

Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. Although this code is used successfully to design TALEs to target specific sequences, off-target binding has been observed and is difficult to predict. Here we explore TALE–DNA interactions comprehensively by quantitatively assaying the DNA-binding specificities of 21 representative TALEs to ∼5,000–20,000 unique DNA sequences per protein using custom-designed protein-binding microarrays (PBMs). We find that protein context features exert significant influences on binding. Thus, the canonical recognition code does not fully capture the complexity of TALE–DNA binding. We used the PBM data to develop a computational model, Specificity Inference For TAL-Effector Design (SIFTED), to predict the DNA-binding specificity of any TALE. We provide SIFTED as a publicly available web tool that predicts potential genomic off-target sites for improved TALE design.

Highlights

  • As our results suggested that interactions between repeats modulate the specificities of their individual repeat variable diresidues (RVDs), we modelled the protein-binding microarray (PBM) data to predict TALE specificity considering the context of each repeat in a TALE protein (Fig. 1c)

  • To assess whether Specificity Inference For TAL-Effector Design (SIFTED) can predict genomic off-target sites for TALE proteins that have not been assayed by PBMs, we examined a data set of in vivo TALE reporter activity[22]

Summary

Introduction

Transcription activator-like effector (TALE) proteins recognize DNA using a seemingly simple DNA-binding code, which makes them attractive for use in genome engineering technologies that require precise targeting. The amino acids at the repeat variable diresidue (RVD) positions determine which DNA base is preferred, and each repeat in the TALE contacts one base in the target site. This led to a simple one-to-one ‘TALE code’ that uniquely predicts the optimal DNA target from the sequence of RVDs within the repeat array[9,10]. These findings suggest that TALE–DNA-binding specificity may be more complex than previously thought, but these effects have yet to be assayed comprehensively and quantitatively.
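
The one-to-one code described above amounts to a lookup from each repeat's RVD to a single preferred base. The following is a minimal sketch of that idea, not the SIFTED model: the RVD-to-base table holds the commonly cited canonical preferences, which the text here does not list, and the names `CANONICAL_TALE_CODE` and `predict_optimal_target` are hypothetical. Because each repeat is treated independently, the sketch ignores exactly the protein context effects that the PBM data indicate are significant.

```python
# Commonly cited canonical RVD-to-base preferences. This table is an
# assumption for illustration; the source text does not enumerate it.
CANONICAL_TALE_CODE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",  # NN is also reported to tolerate A
}


def predict_optimal_target(rvds):
    """Return the DNA target predicted by a one-RVD-to-one-base lookup.

    Each repeat is treated independently, so no context between
    neighbouring repeats is taken into account.
    """
    unknown = [rvd for rvd in rvds if rvd not in CANONICAL_TALE_CODE]
    if unknown:
        raise ValueError(f"RVDs not in the lookup table: {unknown}")
    return "".join(CANONICAL_TALE_CODE[rvd] for rvd in rvds)


if __name__ == "__main__":
    # A hypothetical four-repeat array, one base per repeat.
    print(predict_optimal_target(["HD", "NI", "NG", "NN"]))
```

A lookup like this returns only a single optimal sequence; it gives no quantitative estimate of how well a TALE binds near-matching sites, which is what predicting off-target binding requires.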
