Mammalian internal exons are usually between 50 and 200 bp long. Their average length is ~150 bp, and only 5% of them are larger than 300 bp. Although previous studies showed that splice site recognition becomes less efficient with increasing exon size, the mechanism by which large exons are correctly spliced is not well understood. To investigate the role of RNA binding proteins (RBPs) in the splicing of large exons, we analyzed 842 publicly available RNA‐seq datasets from GEO, in which 48 canonical splicing factors were individually knocked down. MISO analysis was performed to detect exons that were specifically included or excluded upon knockdown of each RBP, and the average lengths of these exons were calculated. Our analysis revealed that SRSF3 is the top RBP promoting inclusion of large exons (390 exons per analysis, 337 bp on average). Interestingly, most of the SRSF3‐dependent large exons are annotated as constitutive exons in the public database. Consistent with this, these exons have strong splice sites. Motif analysis revealed that the SRSF3‐dependent large exons are greatly enriched in C‐rich sequences, which are preferentially recognized by SRSF3. In addition, our analysis detected enrichment of C‐stretches, which are potential binding sites for hnRNP K. ENCODE eCLIP data show that HNRNPK binds extensively to the SRSF3‐dependent large exons. To examine whether hnRNP K is involved in the regulation of SRSF3‐dependent splicing of large exons, we knocked down Hnrnpk together with Srsf3 in C2C12 myoblasts. RNA‐seq analysis revealed that Srsf3 silencing alone induced skipping of 374 exons, whose average length was 395 bp. Co‐silencing of Srsf3 and Hnrnpk rescued the skipping of 142 of these exons, whose average length remained large (458 bp).
These results indicate that a substantial number of large exons require SRSF3 to prevent their skipping by hnRNP K. To investigate the molecules associated with SRSF3 and hnRNP K in the splicing of large exons, we next used a proteomic approach that defines the native protein complexes in the chromatin fraction. Cells expressing FLAG‐tagged SRSF3 or hnRNP K were lysed under physiological salt concentration, and the isolated chromatin fractions were immunoprecipitated with anti‐FLAG antibody. Mass spectrometry analysis of the immunoprecipitates revealed that SRSF3 is mainly associated with U1 snRNPs, U2AFs, and SF1, and to a lesser extent with hnRNPs, whereas hnRNP K is preferentially associated with hnRNPs and U1 snRNPs, but not with U2AF or SF1. Thus, SRSF3 and hnRNP K are involved in early but distinct stages of spliceosome assembly. Together, our analysis identified antagonistic splicing regulation of C‐rich large exons by SRSF3 and HNRNPK.
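
The per-RBP exon-length summary described above (counting the exons skipped upon knockdown of each RBP and averaging their lengths) can be illustrated with a minimal sketch. It is not the authors' pipeline and does not parse MISO output: the record layout `(rbp, start, end, delta_psi)`, the 0-based half-open coordinates, the `min_delta_psi` cutoff of 0.2, and the names `RBP_A` and `RBP_B` with their values are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_skipped_exons(events, min_delta_psi=0.2):
    """Return {rbp: (n_exons, mean_length_bp)} for exons skipped upon knockdown.

    Each event is (rbp, start, end, delta_psi), where delta_psi is assumed to be
    PSI(knockdown) - PSI(control) and coordinates are 0-based, half-open.
    The 0.2 cutoff is an illustrative assumption, not a value from the study.
    """
    lengths = defaultdict(list)
    for rbp, start, end, delta_psi in events:
        # A sufficiently negative delta_psi means the exon is included less
        # often after knockdown, i.e. the RBP normally promotes its inclusion.
        if delta_psi <= -min_delta_psi:
            lengths[rbp].append(end - start)
    return {rbp: (len(v), mean(v)) for rbp, v in lengths.items()}


def rank_by_mean_length(summary):
    """Order RBPs by the mean length of their dependent exons, longest first."""
    return sorted(summary, key=lambda rbp: summary[rbp][1], reverse=True)


# Toy, made-up records; these are not data from the study.
events = [
    ("RBP_A", 1000, 1400, -0.5),
    ("RBP_A", 5000, 5300, -0.3),
    ("RBP_A", 9000, 9100, 0.1),   # inclusion not reduced, so ignored
    ("RBP_B", 2000, 2120, -0.4),
    ("RBP_B", 7000, 7150, -0.25),
]

summary = summarize_skipped_exons(events)
ranking = rank_by_mean_length(summary)
```

Ranking by mean length rather than by exon count mirrors the question the text asks, namely which factor's dependent exons are unusually long; a real analysis would also need replicate handling and a significance filter, which this sketch omits.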