Abstract

High-throughput CRISPR-Cas9 knockout screens using a tiling-sgRNA design permit in situ evaluation of protein domain function. Here, to facilitate de novo identification of essential protein domains from such screens, we propose ProTiler, a computational method for the robust mapping of CRISPR knockout hyper-sensitive (CKHS) regions, which refer to the protein regions associated with a strong sgRNA dropout effect in the screens. Applied to a published CRISPR tiling screen dataset, ProTiler identifies 175 CKHS regions in 83 proteins. Of these CKHS regions, more than 80% overlap with annotated Pfam domains, including all of the 15 known drug targets in the dataset. ProTiler also reveals unannotated essential domains, including the N-terminus of the SWI/SNF subunit SMARCB1, which is validated experimentally. Surprisingly, the CKHS regions are negatively correlated with phosphorylation and acetylation sites, suggesting that protein domains and post-translational modification sites have distinct sensitivities to CRISPR-Cas9-mediated amino acid loss.
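To make the notion of a region "associated with a strong sgRNA dropout effect" concrete, the following is a minimal sketch and not ProTiler's actual algorithm. It assumes per-position dropout scores (for example log2 fold changes, more negative meaning stronger dropout), smooths them with a running median so that a single inactive sgRNA does not split a region, and reports contiguous runs below a cutoff. The function name, window size, threshold, and minimum length are all hypothetical choices made for illustration.

```python
from statistics import median


def call_hypersensitive_regions(scores, window=5, threshold=-1.0, min_len=3):
    """Return inclusive (start, end) index pairs of contiguous runs whose
    running-median score is below `threshold`.

    scores: dropout scores ordered by the protein position each sgRNA
    targets; more negative means stronger dropout. All defaults are
    illustrative assumptions, not values from the paper.
    """
    half = window // 2
    # Running median damps isolated outliers such as inactive sgRNAs.
    smoothed = [
        median(scores[max(0, i - half): i + half + 1])
        for i in range(len(scores))
    ]

    regions, start = [], None
    for i, value in enumerate(smoothed):
        if value < threshold:
            if start is None:
                start = i  # a new candidate region opens here
        elif start is not None:
            if i - start >= min_len:
                regions.append((start, i - 1))
            start = None
    # Close a region that runs to the end of the protein.
    if start is not None and len(smoothed) - start >= min_len:
        regions.append((start, len(smoothed) - 1))
    return regions


if __name__ == "__main__":
    toy = [0.1, -0.2, 0.0, -2.0, -2.5, 0.3, -2.2, -2.4, -2.1, 0.0, 0.2, -0.1]
    print(call_hypersensitive_regions(toy))
```

In the toy input, the score of 0.3 at index 5 mimics an inactive sgRNA inside an otherwise sensitive stretch; the smoothing step keeps the surrounding run in one piece.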

Highlights

  • In a pooled high-throughput CRISPR-Cas9 knockout screen, an sgRNA library contains tens of thousands of sgRNAs

  • Figure legend excerpt: The red line shows the segmented protein regions and their dropout signal levels. (e) A structural model of the condensin complex, in which SMC2 and SMC4 form a heterodimer via hinge domains, and their ATPase head domains (N and C) are associated with kleisin subunits to create a ring-like structure. (f) Categorization of CRISPR knockout hyper-sensitive (CKHS) regions based on the molecular functions of overlapping protein domains. (g) A bar chart showing the proportion of amino acids (AA) in Pfam domains for CKHS regions and non-CKHS regions, respectively

  • When we mapped the post-translational modification (PTM) sites onto our data, we found phosphorylation and acetylation sites were significantly depleted inside CKHS regions compared to outside CKHS regions (Fig. 4c)


Summary

Introduction

In a pooled high-throughput CRISPR-Cas9 knockout screen, an sgRNA library contains tens of thousands of sgRNAs. A computational pipeline, CRISPRO, maps functional scores of tiling sgRNAs to genomes, transcripts, protein coordinates and structures, providing general views of structure-function relationships at discrete protein regions[8]. Despite these advances, pooled high-throughput CRISPR-Cas9 screens are subject to inactive sgRNAs, off-target effects, and high noise-to-signal ratios, posing computational challenges to the robust identification of essential domains. To address these challenges, we propose ProTiler, a computational method designed for the analysis of tiling CRISPR screen data.

Figure legend excerpt: The p-value was empirically computed by random simulation. (h) Distribution of distances between the borders of CKHS regions and domain boundaries as defined in the Pfam database.
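The text above mentions a p-value "empirically computed by random simulation" without describing the procedure. One common way to do this, shown here as a sketch under stated assumptions rather than the authors' method, is to compare the observed fraction of called-region residues that fall inside annotated domains against the same fraction for regions of equal length placed at random positions. The function name, the null model (uniform random placement on a single protein, with overlaps between random regions allowed), and the number of simulations are assumptions.

```python
import random


def empirical_p_value(in_domain, regions, n_sim=10000, seed=0):
    """Estimate how often random regions overlap domains as much as observed.

    in_domain: list of bools, one per residue (True = inside an annotated domain).
    regions:   list of inclusive (start, end) residue index pairs.
    Returns (observed_fraction, p_value).
    Assumed null model: each region keeps its length but gets a uniformly
    random start; random regions may overlap each other.
    """
    rng = random.Random(seed)
    n = len(in_domain)
    lengths = [end - start + 1 for start, end in regions]

    def covered_fraction(placed):
        hits = sum(sum(in_domain[s:e + 1]) for s, e in placed)
        return hits / sum(e - s + 1 for s, e in placed)

    observed = covered_fraction(regions)
    at_least = 0
    for _ in range(n_sim):
        placed = []
        for length in lengths:
            start = rng.randint(0, n - length)  # inclusive bounds
            placed.append((start, start + length - 1))
        if covered_fraction(placed) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_sim + 1)


if __name__ == "__main__":
    domain_mask = [False] * 50 + [True] * 20 + [False] * 30
    print(empirical_p_value(domain_mask, [(52, 67)], n_sim=2000))
```

The add-one correction in the last line is a standard safeguard for simulation-based p-values: with a finite number of simulations, the smallest reportable value is 1/(n_sim + 1) rather than 0.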
