SePIA: RNA and small RNA sequence processing, integration, and analysis.

Katherine Icay, Ville Rantanen, Alejandra Cervera, Ping Chen, Sampsa Hautaniemi, Rainer Lehtonen

doi:10.1186/s13040-016-0099-z

Abstract

Background: Large-scale sequencing experiments are complex and require a wide spectrum of computational tools to extract and interpret relevant biological information. This is especially true in projects where individual processing and integrated analysis of both small RNA and complementary RNA data is needed. Such studies would benefit from a computational workflow that is easy to implement and standardizes the processing and analysis of both sequenced data types.

Results: We developed SePIA (Sequence Processing, Integration, and Analysis), a comprehensive small RNA and RNA workflow. It provides ready execution for over 20 commonly known RNA-seq tools on top of an established workflow engine and provides dynamic pipeline architecture to manage, individually analyze, and integrate both small RNA and RNA data. Implementation with Docker makes SePIA portable and easy to run. We demonstrate the workflow’s extensive utility with two case studies involving three breast cancer datasets. SePIA is straightforward to configure and organizes results into a perusable HTML report. Furthermore, the underlying pipeline engine supports computational resource management for optimal performance.

Conclusion: SePIA is an open-source workflow introducing standardized processing and analysis of RNA and small RNA data. SePIA’s modular design enables robust customization to a given experiment while maintaining overall workflow structure. It is available at http://anduril.org/sepia.

Electronic supplementary material: The online version of this article (doi:10.1186/s13040-016-0099-z) contains supplementary material, which is available to authorized users.

Highlights

Large-scale sequencing experiments are complex and require a wide spectrum of computational tools to extract and interpret relevant biological information.
The second and third datasets comprised Level 1 data of 144 poly(A)-extracted mRNA samples (129 tumor, 15 normal breast tissue) and 149 miRNA samples (133 tumor, 16 normal breast tissue) downloaded from The Cancer Genome Atlas consortium [32].
We organize the datasets into two case studies: the first to showcase SePIA’s utility for transcript-level sequence analysis; the second to demonstrate integration of mRNA and small RNA data.

Summary

Introduction

Large-scale sequencing experiments are complex and require a wide spectrum of computational tools to extract and interpret relevant biological information. This is especially true in projects where individual processing and integrated analysis of both small RNA and complementary RNA data are needed. Such studies would benefit from a computational workflow that is easy to implement and standardizes the processing and analysis of both sequenced data types. Strategies have been developed to computationally identify and interpret biological information from different RNA-seq data types [2,3,4]. These strategies are generally limited to a single data type or to integration alone, with a fixed set of tools and little to no support for extensibility. A solution to this issue is a modular computational platform that allows testing, development, and easy replacement of methods.

Journal: BioData Mining	Publication Date: May 20, 2016
Citations: 90	License type: cc-by

Similar Papers

Integrative Analysis of Small RNA and mRNA Expression Profiles Identifies Signatures Associated With Chronic Epididymitis.
Jialei Gong ... Yanfeng Li
Frontiers in Immunology | VOL. 13
11 May 2022

Small RNAs hit the big time
Iain R Searle ... Charles W Melnyk
New Phytologist | VOL. 174
17 Apr 2007

IMir: an integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq.
Giorgio Giurato ... Maria Ravo
BMC Bioinformatics | VOL. 14
01 Dec 2013

PiRNN: deep learning algorithm for piRNA prediction
Kai Wang ... Joshua Hoeksema
PeerJ | VOL. 6
03 Aug 2018
