Abstract

Background: Splicing is a genetic process that has important implications in several diseases including cancer. Deciphering the complex rules of splicing regulation is crucial to understand and treat splicing-related diseases. Splicing factors and other RNA-binding proteins (RBPs) play a key role in the regulation of splicing. The specific binding sites of an RBP can be measured using CLIP experiments. However, to unveil which RBPs regulate a condition, it is necessary to have a priori hypotheses, as a single CLIP experiment targets a single protein.

Results: In this work, we present a novel methodology to predict context-specific splicing factors from transcriptomic data. For this, we systematically collect, integrate and analyze more than 900 CLIP experiments stored in four CLIP databases: POSTAR2, CLIPdb, DoRiNA and StarBase. The analysis of these experiments shows the strong coherence between the binding sites of RBPs of similar families. Augmenting this information with expression changes, we are able to correctly predict the splicing factors that regulate splicing in two gold-standard experiments in which specific splicing factors are knocked-down.

Conclusions: The methodology presented in this study allows the prediction of active splicing factors in either cancer or any other condition by only using the information of transcript expression. This approach opens a wide range of possible studies to understand the splicing regulation of different conditions. A tutorial with the source code and databases is available at https://gitlab.com/fcarazo.m/sfprediction.

Highlights

  • Splicing is a genetic process that has important implications in several diseases including cancer

  • A unified database of human and mouse cross-linking and immunoprecipitation (CLIP) experiments: we downloaded and integrated the CLIP experiments contained in the POSTAR2, CLIPdb, DoRiNA and StarBase databases, as described in the Methods section

  • Five of these experiments were discarded from the analysis because the RNA-binding proteins (RBPs) under study were mutated


Summary

Introduction

Splicing is a genetic process that has important implications in several diseases including cancer. Splicing factors and other RNA-binding proteins (RBPs) play a key role in the regulation of splicing. The specific binding sites of an RBP can be measured using CLIP experiments. To unveil which RBPs regulate a condition, it is necessary to have a priori hypotheses, as a single CLIP experiment targets a single protein. The expansive diversity of the transcriptome, induced by pre-mRNA splicing, plays a key role in the development of a broad spectrum of human diseases [1,2,3]. RBPs’ binding motifs are usually represented by position weight matrices (PWMs) that provide the probability of finding a specific nucleotide at each position of the motif. The weakest step of this pipeline is the identification of the specific binding sites for the RBPs. PWMs are usually short
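To illustrate how a PWM assigns a probability to each nucleotide at each motif position, the following minimal Python sketch scores candidate binding sites along an RNA sequence. The matrix values, the uniform background frequency of 0.25 and the function names are assumptions made for illustration only; they are not taken from the study or from any of the CLIP databases.

```python
import math

# Hypothetical 4-position PWM (illustrative values, not from the study).
# Each column gives the probability of each nucleotide at that position.
EXAMPLE_PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "U": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "U": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "U": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "U": 0.7},
]


def log_odds_score(site, pwm, background=0.25):
    """Sum of log2(P(nucleotide at position) / background) over the site.

    The uniform background of 0.25 is an assumption of this sketch.
    """
    if len(site) != len(pwm):
        raise ValueError("site length must equal the PWM width")
    return sum(math.log2(column[nt] / background)
               for nt, column in zip(site, pwm))


def best_site(sequence, pwm):
    """Slide the PWM along the sequence; return (start, site, score) of the top window."""
    width = len(pwm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PWM")
    scored = (
        (log_odds_score(sequence[i:i + width], pwm), i)
        for i in range(len(sequence) - width + 1)
    )
    score, start = max(scored)
    return start, sequence[start:start + width], score


if __name__ == "__main__":
    start, site, score = best_site("AAGCAUAA", EXAMPLE_PWM)
    print(f"best window starts at {start}: {site} (log-odds {score:.2f})")
```

A positive log-odds score means the window matches the motif better than the background model would predict. A PWM with only a few positions can reach a high score at many places in a long transcript, which gives a sense of why motif matches alone are a weak basis for calling binding sites.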

