Abstract

Alternative splicing (AS) is the process of combining different parts of the pre-mRNA to produce diverse transcripts, and eventually different protein products, from a single gene. In computational biology, researchers try to understand AS behavior and regulation using computational models known as “splicing codes”. The final goal of these algorithms is to predict the AS outcome in silico from genomic sequence. Here, we develop a deep learning approach, called Deep Splicing Code (DSC), for categorizing the well-studied classes of AS, namely alternatively skipped exons, alternative 5’ss, alternative 3’ss, and constitutively spliced exons, based only on the sequence of the exon junctions. The proposed approach significantly improves prediction, and the results reveal that constitutive exons have local characteristics that distinguish them from alternatively spliced exons. Using a motif visualization technique, we show that the trained models learned to search for competing alternative splice sites as well as motifs of important splicing factors with high precision. Thus, the proposed approach expands the opportunities to improve alternative splicing modeling. In addition, a web server for predicting AS events has been developed based on the proposed method.

Highlights

  • Alternative splicing is the key contributor to human transcriptome diversity, producing multiple messenger RNAs from a single pre-mRNA

  • To distinguish between different types of human internal exons, we propose four classification models: constitutive exons vs. skipped exons (CON-ES, where ES denotes exon skipping), constitutive exons vs. exons with a used alternative 3’ splice site (CON-ALT3), constitutive exons vs. exons with a used alternative 5’ splice site (CON-ALT5), and a general model that classifies all exon types, called Deep Splicing Code (DSC)

  • We developed a method for classification of the human internal exons according to their alternative splicing behavior using the local RNA sequences only
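As a rough illustration of what classifying exons from local sequence alone could look like at the input stage, the sketch below one-hot encodes a short sequence and maps class names to integer targets for the four tasks named above. The label names (`constitutive`, `skipped`, `alt3`, `alt5`), the helper functions, and the example sequence are assumptions made for illustration; they are not taken from the authors' implementation, and the actual model input and architecture may differ.

```python
# Hypothetical sketch: input encoding and label sets for the four tasks.
# Names and label strings are illustrative assumptions, not the paper's code.

BASES = "ACGU"

# The four classification tasks listed in the highlights, as ordered label sets.
TASKS = {
    "CON-ES": ("constitutive", "skipped"),
    "CON-ALT3": ("constitutive", "alt3"),
    "CON-ALT5": ("constitutive", "alt5"),
    "DSC": ("constitutive", "skipped", "alt3", "alt5"),
}


def one_hot(seq):
    """Encode a sequence as rows of 4 values in the order A, C, G, U (T is read as U)."""
    rows = []
    for base in seq.upper().replace("T", "U"):
        row = [0, 0, 0, 0]
        if base in BASES:  # ambiguous bases such as N stay all-zero
            row[BASES.index(base)] = 1
        rows.append(row)
    return rows


def encode_label(task, label):
    """Return the integer class index of `label` within the given task."""
    return TASKS[task].index(label)


# Made-up junction fragment, used only to show the shape of the encoding.
encoded = one_hot("GUAAGN")
print(len(encoded), "positions x", len(encoded[0]), "channels")
print(encode_label("DSC", "alt3"))
```

A matrix of this kind (positions by four channels) is a common input format for sequence-based neural networks; the three binary tasks and the general task differ only in which label set the output layer predicts.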


Summary

Introduction

Alternative splicing is the key contributor to human transcriptome diversity, producing multiple messenger RNAs (mRNAs) from a single pre-mRNA. In this process, the non-coding intronic sequences are removed and exons are connected in different combinations. The most common AS events are skipped exons (cassette exons), retained introns, and exons with an alternative 3’ or 5’ splice-site selection [1,2]. Sequence-level genomic variation in a transcript can cause defects in splicing. This misregulation contributes significantly to human disease [3,4] and multiple types of cancer [5]. Researchers have built computational models, known as ‘splicing codes’, to predict AS outcome from DNA sequences alone, independently of existing resources such as disease annotation and population data [6].

