Abstract

Background: Computational tools analyzing RNA-sequencing data have boosted alternative splicing research by identifying and assessing differentially spliced genes. However, common alternative splicing analysis tools differ substantially in their statistical analyses and general performance. This report compares the computational performance (CPU utilization and RAM usage) of three event-level splicing tools: rMATS, MISO, and SUPPA2. Additionally, concordance between tool outputs was investigated.

Results: Log-linear relations were found between job times and dataset size for all splicing tools and all virtual machine (VM) configurations. MISO had the longest job times for all analyses, irrespective of VM size, and MISO analyses also exceeded maximum CPU utilization on all VM sizes. rMATS and SUPPA2 load averages were relatively low in both size and replicate comparisons, not nearing maximum CPU utilization even in the VM simulating the lowest computational power (D2 VM). RAM usage in rMATS and SUPPA2 did not exceed 20% of maximum RAM in either size or replicate comparisons, while MISO reached maximum RAM usage in the D2 VM analyses for input size. Differential splicing analyses showed high correlation (β > 80%) between the outputs of different tools, with the exception of retained intron (RI) events in the rMATS/MISO and rMATS/SUPPA2 comparisons (β < 60%).

Conclusions: Prior to RNA-seq analyses, users should consider job time, number of replicates, and splice event type of interest to determine the optimal alternative splicing tool. In general, rMATS is superior to both MISO and SUPPA2 in computational performance. Analysis outputs show high concordance between tools, with the exception of RI events.

Highlights

  • Computational tools analyzing RNA-sequencing data have boosted alternative splicing research by identifying and assessing differentially spliced genes

  • Regardless of the virtual machine (VM) type used, MISO required the longest job time for each dataset size

  • Job times for SUPPA2 were consistent regardless of file size, since percent spliced in (PSI) calculations were performed on transcript expression files, which are identical for each analysis
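
The last highlight turns on the fact that PSI can be computed from transcript expression values alone, without re-reading the sequencing reads. A minimal sketch of that idea follows, assuming PSI for an event is the summed expression (TPM) of transcripts that include the event divided by the summed expression of all transcripts that define it. The function name `psi_from_tpm`, the transcript IDs, and the TPM values are hypothetical and used only for illustration; this is not SUPPA2's actual code or interface.

```python
from typing import Dict, Iterable, Optional


def psi_from_tpm(
    tpm: Dict[str, float],
    inclusion_ids: Iterable[str],
    exclusion_ids: Iterable[str],
) -> Optional[float]:
    """Return PSI = inclusion TPM / (inclusion TPM + exclusion TPM).

    Transcripts missing from `tpm` are treated as unexpressed (0.0).
    Returns None when no transcript defining the event is expressed,
    because PSI is undefined in that case.
    """
    inclusion = sum(tpm.get(t, 0.0) for t in inclusion_ids)
    exclusion = sum(tpm.get(t, 0.0) for t in exclusion_ids)
    total = inclusion + exclusion
    if total == 0:
        return None
    return inclusion / total


# Hypothetical expression values for three transcripts of one gene.
tpm = {"tx1": 30.0, "tx2": 10.0, "tx3": 5.0}

# Hypothetical event: tx1 includes the exon, tx2 skips it.
psi = psi_from_tpm(tpm, inclusion_ids=["tx1"], exclusion_ids=["tx2"])
print(psi)
```

Because the only input is a table of transcript abundances, the cost of this step depends on the number of transcripts and events rather than on the size of the read files, which is consistent with the constant job times reported above.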

Introduction

Computational tools analyzing RNA-sequencing data have boosted alternative splicing research by identifying and assessing differentially spliced genes. Alternative splicing (AS) plays a part in many biological processes involved in normal cellular functions such as homeostasis, differentiation, or sex determination [2, 6, 7], as well as in disease pathogenesis and in pharmacological processes underlying drug resistance [3, 8, 9]. With respect to the latter, AS has been proposed as a source of potential biomarkers or as a target for drug development [10, 11, 12]. In childhood acute lymphoblastic leukemia (ALL), an alternative 5′ splice site (A5SS) selection in exon 8 of the folate-metabolizing enzyme folylpolyglutamate synthetase (FPGS) was shown to be associated with the clinical response to methotrexate (MTX), an anchor drug in the treatment of ALL [13] and rheumatoid arthritis [14].
