Abstract

The ability to easily and efficiently analyse RNA-sequencing data is a key strength of the Bioconductor project. Starting with counts summarised at the gene level, a typical analysis involves pre-processing, exploratory data analysis, differential expression testing and pathway analysis, with the results informing future experiments and validation studies. In this workflow article, we analyse RNA-sequencing data from the mouse mammary gland, demonstrating use of the popular edgeR package to import, organise, filter and normalise the data, followed by the limma package with its voom method, linear modelling and empirical Bayes moderation to assess differential expression and perform gene set testing. This pipeline is further enhanced by the Glimma package, which enables interactive exploration of the results so that the user can examine individual samples and genes. The complete analysis offered by these three packages highlights the ease with which researchers can turn the raw counts from an RNA-sequencing experiment into biological insights using Bioconductor.

Highlights

  • RNA-sequencing (RNA-seq) has become the primary technology used for gene expression profiling, and the genome-wide detection of differentially expressed genes between two or more conditions of interest is one of the questions researchers ask most often.

  • We describe an edgeR-limma workflow for analysing RNA-seq data that takes gene-level counts as its input and moves through pre-processing and exploratory data analysis before obtaining lists of differentially expressed (DE) genes and gene signatures.

  • Reads were aligned to the mouse reference genome using the R-based pipeline available in the Rsubread package. Count data for these samples can be downloaded from the Gene Expression Omnibus (GEO) using GEO Series accession number GSE63310.
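
The filtering part of the pre-processing step can be illustrated with a small sketch. The Python below is not the edgeR code used in the workflow; it is a hand-written illustration of one common way to drop lowly expressed genes, based on counts per million (CPM). The function names, the toy counts and the thresholds (`min_cpm`, `min_samples`) are all assumptions chosen for the example.

```python
# Illustrative only: a hand-rolled counts-per-million (CPM) filter.
# This is NOT the edgeR implementation; names and thresholds are hypothetical.

def counts_per_million(counts):
    """Convert a genes x samples matrix of raw counts to CPM, sample by sample."""
    n_samples = len(counts[0])
    # Library size = total count in each sample (column sum).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)]
            for row in counts]

def keep_expressed(counts, min_cpm=1.0, min_samples=2):
    """Flag genes whose CPM exceeds min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return [sum(value > min_cpm for value in row) >= min_samples
            for row in cpm]

# Hypothetical toy data: 4 genes (rows) x 3 samples (columns).
toy_counts = [
    [500_000, 400_000, 450_000],
    [499_999, 599_998, 549_999],
    [1, 2, 0],
    [0, 0, 1],
]

keep = keep_expressed(toy_counts)
filtered = [row for row, flag in zip(toy_counts, keep) if flag]
```

Working on the CPM scale rather than on raw counts means the filter is not biased towards samples that happen to have been sequenced more deeply. Suitable thresholds depend on the library sizes and the experimental design, so the values above should be read as placeholders.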


Summary

17 Jun 2016

3. Jovana Maksimovic, Murdoch Children's Research Institute, Parkville, Australia.

This article is included in the International Society for Computational Biology Community and in the Bioconductor gateway.

The article has been updated with minor edits to the R code for simplification and minor text changes to clarify ambiguous wording. Extra information has been added to the section on “Gene set testing with camera” to briefly describe the c2 and hallmark gene sets and to explain how to choose between camera and mroast, another gene set testing method available in limma. Author contributions have been specified in the update.

