Abstract
High-throughput CRISPR-Cas9 knockout screens using a tiling-sgRNA design permit in situ evaluation of protein domain function. Here, to facilitate de novo identification of essential protein domains from such screens, we propose ProTiler, a computational method for the robust mapping of CRISPR knockout hyper-sensitive (CKHS) regions, which refer to the protein regions associated with a strong sgRNA dropout effect in the screens. Applied to a published CRISPR tiling screen dataset, ProTiler identifies 175 CKHS regions in 83 proteins. Of these CKHS regions, more than 80% overlap with annotated Pfam domains, including all of the 15 known drug targets in the dataset. ProTiler also reveals unannotated essential domains, including the N-terminus of the SWI/SNF subunit SMARCB1, which is validated experimentally. Surprisingly, the CKHS regions are negatively correlated with phosphorylation and acetylation sites, suggesting that protein domains and post-translational modification sites have distinct sensitivities to CRISPR-Cas9-mediated amino acid loss.
Highlights
In a pooled high-throughput CRISPR-Cas9 knockout screen, an sgRNA library contains tens of thousands of sgRNAs.
Figure caption excerpt: The red line shows the segmented protein regions and their dropout signal levels. (e) A structural model of the condensin complex, in which SMC2 and SMC4 form a heterodimer via hinge domains, and their ATPase head domains (N and C) are associated with kleisin subunits to create a ring-like structure. (f) Categorization of CRISPR knockout hyper-sensitive (CKHS) regions based on the molecular functions of overlapping protein domains. (g) A bar chart showing the proportion of amino acids (AA) in Pfam domains for CKHS regions and non-CKHS regions, respectively.
When we mapped the post-translational modification (PTM) sites onto our data, we found that phosphorylation and acetylation sites were significantly depleted inside CKHS regions compared with outside CKHS regions (Fig. 4c).
Summary
In a pooled high-throughput CRISPR-Cas9 knockout screen, an sgRNA library contains tens of thousands of sgRNAs. A computational pipeline, CRISPRO, maps functional scores of tiling sgRNAs to genomes, transcripts, protein coordinates and structures, providing general views of structure-function relationships at discrete protein regions[8]. Despite these advances, pooled high-throughput CRISPR-Cas9 screens are subject to inactive sgRNAs, off-target effects, and high noise-to-signal ratios, posing computational challenges to the robust identification of essential domains. To address these challenges, we propose ProTiler, a computational method designed for the analysis of tiling CRISPR screen data.
Figure caption excerpt: The p-value was empirically computed by random simulation. (h) Distribution of distances between the borders of CKHS regions and domain boundaries as defined in the Pfam database.
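The text states only that the p-value was computed empirically by random simulation; the exact procedure is not given here. As an illustration, the following minimal sketch shows one plausible way to do this for the overlap between CKHS regions and Pfam domains within a single protein: each region is repositioned uniformly at random along the protein, and the p-value is the fraction of simulations whose overlap is at least as large as the observed one. The function names (`overlap`, `empirical_p_value`), the uniform placement scheme, and the add-one correction are assumptions made for this example, not ProTiler's actual implementation.

```python
import random


def overlap(regions, domains):
    """Count residues covered by both a region and a domain.

    Intervals are (start, end) tuples, 1-based and inclusive.
    """
    domain_residues = set()
    for start, end in domains:
        domain_residues.update(range(start, end + 1))
    region_residues = set()
    for start, end in regions:
        region_residues.update(range(start, end + 1))
    return len(region_residues & domain_residues)


def empirical_p_value(regions, domains, protein_length, n_sim=2000, seed=0):
    """Estimate how often randomly placed regions overlap domains
    at least as much as the observed regions do (hypothetical null model)."""
    rng = random.Random(seed)
    observed = overlap(regions, domains)
    hits = 0
    for _ in range(n_sim):
        shuffled = []
        for start, end in regions:
            length = end - start + 1
            # Place a region of the same length uniformly within the protein.
            new_start = rng.randint(1, protein_length - length + 1)
            shuffled.append((new_start, new_start + length - 1))
        if overlap(shuffled, domains) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)
```

For example, `empirical_p_value([(101, 200)], [(101, 200)], protein_length=1000)` tests whether a 100-residue region that coincides exactly with a domain could plausibly arise by chance. A dataset-wide analysis would need to aggregate over proteins and might use a different null model.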