Abstract

Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators of plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed the repeat-variable diresidue (RVD), bind to contiguous nucleotides of the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated in recent years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement in prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that for several TALEs no target genes are predicted by any prediction tool; we term these orphan TALEs. We postulate that one explanation for orphan TALEs is incomplete gene annotation and, hence, propose to replace promoterome-wide scans with genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans can be recovered by genome-wide scans, and that genome-wide scans, combined with RNA-seq data, can detect putative targets independently of existing gene annotations.

Highlights

  • Pairs of TALE RVD sequence and tested target boxes were obtained from Fig. 1 and Fig. 2a of [22]

  • Data were grouped by TALE, and the global weight was computed as the maximum GUS activity for the current TALE divided by the maximum GUS activity reported for all TALEs with the same 13th AA at the varied positions stemming from the same experiment

  • Target values were computed as the GUS activity of the current pair of TALE and target box divided by maximum GUS activity over all tested target boxes for the current TALE
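The normalization described in the highlights above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout, the group key (experiment, 13th AA) and all numbers are hypothetical, and measured activities are assumed to be strictly positive.

```python
from collections import defaultdict


def target_values(activity):
    """Divide each activity by the maximum activity of the same TALE.

    activity: dict mapping (tale, box) -> measured reporter activity
              (assumed > 0; hypothetical data layout).
    Returns (targets, max_per_tale).
    """
    max_per_tale = defaultdict(float)
    for (tale, _box), value in activity.items():
        max_per_tale[tale] = max(max_per_tale[tale], value)
    targets = {
        (tale, box): value / max_per_tale[tale]
        for (tale, box), value in activity.items()
    }
    return targets, dict(max_per_tale)


def global_weights(max_per_tale, group_of):
    """Divide each TALE's maximum by the maximum within its group.

    group_of: dict mapping tale -> group key, e.g. (experiment, 13th AA)
              (the key layout is an assumption made for this sketch).
    """
    max_per_group = defaultdict(float)
    for tale, tale_max in max_per_tale.items():
        group = group_of[tale]
        max_per_group[group] = max(max_per_group[group], tale_max)
    return {
        tale: tale_max / max_per_group[group_of[tale]]
        for tale, tale_max in max_per_tale.items()
    }


# Hypothetical activities for two TALEs from the same experiment and group.
activity = {
    ("TALE_A", "box1"): 8.0,
    ("TALE_A", "box2"): 2.0,
    ("TALE_B", "box1"): 4.0,
    ("TALE_B", "box3"): 1.0,
}
group_of = {"TALE_A": ("exp1", "D"), "TALE_B": ("exp1", "D")}

targets, max_per_tale = target_values(activity)
weights = global_weights(max_per_tale, group_of)
```

With these made-up numbers, the best box of each TALE receives a target value of 1.0, while the global weight of TALE_B is half that of TALE_A, reflecting its weaker maximum activity within the group.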

Introduction

Pairs of TALE RVD sequence and tested target boxes were obtained from Fig. 1 and Fig. 2a of [22]. Data were grouped by TALE, and the global weight was computed as the maximum “Normalized reporter activation” for the current TALE divided by the maximum “Normalized reporter activation” reported for all TALEs with the same 13th AA at the varied positions.
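One way to write this global weight as a formula is given below. The notation is an assumption introduced here for illustration: a(T, b) denotes the normalized reporter activation of TALE T on target box b, B_T the set of boxes tested for T, and G(T) the set of TALEs sharing the 13th AA of T at the varied positions.

```latex
w_T \;=\; \frac{\max_{b \in B_T} a(T, b)}
               {\max_{T' \in G(T)} \, \max_{b \in B_{T'}} a(T', b)}
```

Under this reading, the TALE with the strongest activation in its group receives a weight of 1, and all other TALEs in the group receive proportionally smaller weights.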

