Abstract

A complete RNA-Seq analysis involves the use of several different tools, with substantial software and computational requirements. The Galaxy platform simplifies the execution of such bioinformatics analyses by embedding the needed tools in its web interface, while also providing reproducibility. Here, we describe how to perform a reference-based RNA-Seq analysis using Galaxy, from data upload to visualization and functional enrichment analysis of differentially expressed genes.

Highlights

  • In recent years, RNA sequencing has become a very widely used technology to analyze the continuously changing cellular transcriptome, that is, the set of all RNA molecules in one cell or a population of cells

  • One of the most common aims of RNA-Seq is the profiling of gene expression by identifying genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions

  • This chapter is based on the “Reference-based RNA-Seq data analysis” tutorial, and we refer the reader to additional explanations there

Summary

Introduction

RNA sequencing (RNA-Seq for short) has become a very widely used technology for analyzing the continuously changing cellular transcriptome, that is, the set of all RNA molecules in one cell or a population of cells. The computational workflow for detecting differentially expressed (DE) genes and pathways from RNA-Seq data requires several command-line tools and substantial computational resources that most users may not have access to. In Galaxy, the steps of an analysis are executed by running Galaxy tools, which describe how the parameters of command-line software are presented in a user-friendly web interface. The graphical web interface, together with a large number of high-quality, community-developed and maintained tools and training materials, enables rapid interactive analyses for novices and expert users alike. An important community-maintained resource is the Galaxy Training Material (available at https://training.galaxyproject.org) [2], which hosts a wide range of step-by-step hands-on tutorials for common bioinformatic analysis tasks. This chapter is based on the “Reference-based RNA-Seq data analysis” tutorial (https://training.galaxyproject.org/topics/transcriptomics/tutorials/refbased/tutorial.html), and we refer the reader to additional explanations there. In this chapter we use a selection of Galaxy tools to show, step by step, how to find differentially expressed genes, from data upload to functional enrichment analysis, using real experimental data.
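To make the idea of a tool description more concrete, the following toy sketch shows how values entered in a form could be turned into a command line. It is only an illustration under stated assumptions: the dictionary layout, the program name `example_aligner`, and its flags are hypothetical, and the code does not reproduce Galaxy's actual tool definition format or implementation.

```python
import shlex

# Hypothetical tool description: every name here is illustrative only and
# does not reproduce Galaxy's actual tool definition format.
TOOL = {
    "executable": "example_aligner",  # hypothetical command-line program
    "params": [
        # (form field name, command-line flag, default value)
        ("threads", "--threads", 4),
        ("min_quality", "--min-quality", 20),
    ],
    "inputs": ["reads"],  # form fields holding positional input files
}


def build_command(tool, form_values):
    """Translate values submitted through a form into a shell command line."""
    argv = [tool["executable"]]
    for field, flag, default in tool["params"]:
        # Fall back to the declared default when the user left a field empty
        value = form_values.get(field, default)
        argv += [flag, str(value)]
    for field in tool["inputs"]:
        # Input files are required; a missing one raises KeyError
        argv.append(form_values[field])
    # shlex.join quotes each argument, so file names with spaces stay intact
    return shlex.join(argv)


command = build_command(TOOL, {"min_quality": 30, "reads": "sample 1.fastq"})
print(command)
```

The point of the design is that the user only supplies form values, while the description decides which flags, defaults, and quoting end up on the command line.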

RNA-Seq Dataset
Upload FASTQ to Galaxy
Quality Control and Trimming
Mapping
Count the Number of Reads per Annotated
Identification of Differentially Expressed Genes
Import the seven count files from the same Shared Data library
Visualization
Functional Enrichment Analysis
Sharing the Results
Findings
Conclusion