Abstract
Background
It is generally thought that most canonical or non-canonical splicing events involving U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm remains unclear. In recent years, next-generation sequencing technologies have revolutionized the field. The “Read-Split-Walk” (RSW) and “Read-Split-Run” (RSR) methods were developed to identify non-canonical spliced regions genome-wide, including special events occurring in the cytoplasm. As large amounts of genome and transcriptome data, such as those from the Encyclopedia of DNA Elements (ENCODE) project, have been generated, we have developed a newer, more memory-efficient version of the algorithm, “Read-Split-Fly” (RSF), which can detect non-canonical spliced regions with higher sensitivity and improved speed. The RSF algorithm also outputs the spliced sequences for downstream biological function analysis.
Results
We used open-access ENCODE RNA-Seq data to examine whether some spliced intron sequences carry potential signatures of U12-type splicing. The check was performed by searching the spliced sequences against 5’ss and 3’ss sequences from U12DB, the well-known database of orthologous U12-type spliceosomal introns. Preliminary results from 70 ENCODE samples indicated that 5’ss with a U12-type signature are more frequent than those with a U2-type signature and are prevalent in the non-canonical junctions reported by RSF. The selected spliced sequences were further studied using miRBase to elucidate their functionality. These preliminary results show that several miRNAs are prevalent in the studied ENCODE samples. Two of these, hsa-miR-1273 and hsa-miR-548, are associated with many diseases and cancers, as suggested in the literature.
Conclusions
Our RSF pipeline is able to detect many possible junctions (especially those with a high RPKM) with very high overall accuracy and relatively high accuracy for novel junctions. We have incorporated useful parameter features into the pipeline, such as handling variable-length read data and searching spliced sequences for splicing signatures and miRNA events. We suggest that RSF, a tool for identifying novel splicing events, is applicable to the study of a range of diseases across biological systems under different experimental conditions.
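The signature check described in the Results (searching spliced sequences against reference 5’ss and 3’ss sequences) can be illustrated with a minimal sketch. This is not the RSF implementation. The exact-substring matching, the function name `find_signature_hits`, and the toy reference motifs and junction sequences below are all assumptions made for illustration; the real analysis used sequences taken from U12DB, and the abstract does not state how matches were scored.

```python
from collections import Counter


def find_signature_hits(spliced_seqs, reference_sites):
    """Map each spliced sequence ID to the reference site names it contains.

    Matching is plain exact-substring search (an assumption for this sketch).
    Sequences are upper-cased and U is converted to T before comparison.
    """
    hits = {}
    for seq_id, seq in spliced_seqs.items():
        normalized = seq.upper().replace("U", "T")
        matched = [
            name
            for name, site in reference_sites.items()
            if site.upper().replace("U", "T") in normalized
        ]
        if matched:
            hits[seq_id] = matched
    return hits


# Hypothetical stand-ins for reference 5'ss entries (not taken from U12DB).
reference_5ss = {
    "u12_like_5ss": "ATATCCTT",
    "u2_like_5ss": "GTAAGT",
}

# Hypothetical spliced sequences, standing in for RSF output.
spliced = {
    "junction_1": "ATATCCTTTGCAGGCTTCCTTAACTGCAC",
    "junction_2": "GTAAGTCTTGGCATTTCTCTTTCAG",
    "junction_3": "CCCCGGGGAAAATTTT",
}

hits = find_signature_hits(spliced, reference_5ss)

# Tally how many junctions carry each signature type.
counts = Counter(name for names in hits.values() for name in names)
print(hits)
print(dict(counts))
```

Comparing the per-signature tallies across samples is one simple way to ask whether U12-type 5’ss signatures are more frequent than U2-type ones; a production analysis would likely need mismatch-tolerant or position-weighted matching rather than exact substrings.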
Highlights
It is generally thought that most canonical or non-canonical splicing events involving U2- and U12 spliceosomes occur within nuclear pre-mRNAs
RSF pipeline: The presence of novel isoforms produced by splicing independent of normal mRNA processing was previously identified by the Read-Split-Walk (RSW) pipeline, developed in 2014 [49], and by Read-Split-Run (RSR) [50]
We developed an updated version of the algorithm: Read-Split-Fly (RSF). The enhanced pipeline offers improved performance, higher sensitivity, and flexible parameter features. It achieved reasonable specificity (>60%) for novel junctions in half of the tested Encyclopedia of DNA Elements (ENCODE) samples and high specificity for detection of both known and novel junctions in three quarters of the tested ENCODE samples, with some samples reaching 98% specificity (Fig. 1)
Summary
It is generally thought that most canonical or non-canonical splicing events involving U2- and U12-type spliceosomes occur within nuclear pre-mRNAs. However, whether at least some U12-type splicing occurs in the cytoplasm remains unclear. The alternative splicing (AS) process removes introns from nuclear pre-mRNAs with the help of the spliceosome, which recognizes conserved short consensus sequences within the introns and at intron-exon boundaries. Two types of spliceosome complex that catalyze pre-mRNA splicing have been identified [5]. Although U2-type and U12-type spliceosomes share most of their protein components, seven protein components are unique and associated with the U11/U12 di-snRNP, so that the U11/U12 di-snRNP can recognize the branch point sequences and the 5′ splice sites of U12-type introns [8]