Abstract

Background: Remarkable advances in Next Generation Sequencing (NGS) technologies, bioinformatics algorithms and computational technologies have significantly accelerated genomic research. However, complicated NGS data analysis remains a major bottleneck. RNA-seq, one of the major areas in the NGS field, also faces great challenges in data analysis.

Results: To address the challenges in RNA-seq data analysis, we developed a web portal that offers three integrated workflows for end-to-end computation and analysis, including sequence quality control, read-mapping, transcriptome assembly, reconstruction and quantification, and differential analysis. The first workflow uses the Tuxedo suite of tools (TopHat, Cufflinks, Cuffmerge and Cuffdiff). The second workflow deploys Trinity for de novo assembly and uses RSEM for transcript quantification and edgeR for differential analysis. The third combines STAR, RSEM, and edgeR for data analysis. All of these workflows support multiple samples and multiple groups of samples, and perform differential analysis between groups in a single workflow job submission. The calculated results are available for download and post-analysis. The supported animal species include chicken, cow, duck, goat, pig, horse, rabbit, sheep, and turkey, as well as several other model organisms including yeast, C. elegans, Drosophila, and human, with genomic sequences and annotations obtained from ENSEMBL. The RNA-seq portal is freely available from http://weizhongli-lab.org/RNA-seq.

Conclusions: The web portal offers not only bioinformatics software, workflows, computation and reference data, but also an integrated environment for complex RNA-seq data analysis for agricultural animal species. In this project, our aim is not to develop new RNA-seq tools, but to build web workflows that use popular existing RNA-seq methods and make these tools more accessible to the community.

Highlights

  • Remarkable advances in Next Generation Sequencing (NGS) technologies, bioinformatics algorithms and computational technologies have significantly accelerated genomic research

  • We have developed a web portal offering integrated workflows that can perform end-to-end computation and analysis, including sequence quality control (QC), read-mapping, transcriptome assembly, reconstruction and quantification, and multiple analysis tools

  • The workflows in this project are configured with a lightweight workflow engine we developed in our earlier projects [36], supported by the Human Microbiome Project (HMP)
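
The three workflows named in the abstract can be summarized as ordered lists of stages. The sketch below is only an illustration: the tool names come from the abstract, while the dictionary layout, the workflow keys, the stage labels, and the helper names `tools_for` and `workflows_using` are assumptions made here, not the portal's configuration format or its workflow engine.

```python
# Illustrative summary of the three workflows described in the abstract.
# Tool names are from the abstract; the keys, stage labels, and helper
# functions are hypothetical and do not reflect the portal's real format.
WORKFLOWS = {
    "tuxedo": [
        ("read-mapping", "TopHat"),
        ("transcript assembly", "Cufflinks"),
        ("assembly merging", "Cuffmerge"),
        ("differential analysis", "Cuffdiff"),
    ],
    "trinity": [
        ("de novo assembly", "Trinity"),
        ("transcript quantification", "RSEM"),
        ("differential analysis", "edgeR"),
    ],
    "star": [
        ("read-mapping", "STAR"),
        ("transcript quantification", "RSEM"),
        ("differential analysis", "edgeR"),
    ],
}


def tools_for(workflow):
    """Return the tools of one workflow in execution order."""
    return [tool for _stage, tool in WORKFLOWS[workflow]]


def workflows_using(tool):
    """Return the sorted names of all workflows that include a given tool."""
    return sorted(
        name
        for name, stages in WORKFLOWS.items()
        if any(stage_tool == tool for _stage, stage_tool in stages)
    )


if __name__ == "__main__":
    for name in WORKFLOWS:
        print(name, "->", " > ".join(tools_for(name)))
```

Laid out this way, it is easy to see that the second and third workflows share their quantification and differential-analysis steps and differ only in how transcripts are obtained (de novo assembly versus mapping to a reference).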

Introduction

Remarkable advances in Next Generation Sequencing (NGS) technologies [1] and in computational theory and practice, together with the rapid development of bioinformatics algorithms in recent years, have significantly accelerated genomic research. Sequencing steady-state RNA in a biological sample (RNA-seq) [2, 3], one of the major NGS approaches, has been widely used in many fields. RNA-seq overcomes many limitations of previous technologies, such as microarrays and real-time PCR. Many tools and methods have been developed for RNA-seq data analysis. Major categories of these tools include read-mapping, transcriptome assembly or reconstruction, and expression quantification [4].
