An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study

Zichen Wang, Avi Ma'Ayan

doi:10.12688/f1000research.9110.1

Open Access

Abstract

RNA-seq analysis is becoming a standard method for global gene expression profiling. However, running an open, standard RNA-seq analysis pipeline remains challenging for non-experts because of the large size of the raw data files and the hardware requirements of the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. It enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap between the upregulated genes and genes that, when knocked out in mice, induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns of mothers infected with the virus during pregnancy. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/.
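The PCA step mentioned in the abstract can be illustrated with a minimal sketch. This is not the notebook's code: the function name `pca_samples`, the log2(x + 1) transform, and the toy four-sample matrix are all assumptions made for illustration only.

```python
import numpy as np

def pca_samples(expr, n_components=2):
    """Project samples onto their leading principal components.

    expr: array of shape (n_genes, n_samples) with non-negative expression values.
    Returns (coords, explained_ratio); coords has shape (n_samples, n_components).
    """
    # Log-transform to dampen the influence of highly expressed genes (assumed choice).
    logged = np.log2(np.asarray(expr, dtype=float) + 1.0)
    # Put samples in rows and center every gene across samples.
    X = logged.T
    X = X - X.mean(axis=0, keepdims=True)
    # PCA via singular value decomposition of the centered matrix.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    coords = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:n_components]

# Toy data (hypothetical): 200 genes, 2 control and 2 infected samples.
rng = np.random.default_rng(0)
n_genes = 200
base = rng.gamma(shape=2.0, scale=50.0, size=(n_genes, 1))
control = base * rng.lognormal(0.0, 0.1, size=(n_genes, 2))
shift = np.ones((n_genes, 1))
shift[:40] = 4.0  # the first 40 genes are 4-fold higher in the infected group
infected = base * shift * rng.lognormal(0.0, 0.1, size=(n_genes, 2))
expr = np.hstack([control, infected])

coords, explained = pca_samples(expr)
for name, (pc1, pc2) in zip(["ctrl_1", "ctrl_2", "zikv_1", "zikv_2"], coords):
    print(f"{name}: PC1={pc1:.2f}, PC2={pc2:.2f}")
```

In this toy setting the two groups should fall on opposite sides of the first component, which is the kind of separation a PCA plot is used to check before differential expression analysis.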

Highlights

The increase in awareness about the irreproducibility of scientific research requires the development of methods that make experimental and computational protocols repeatable and transparent[1].
Here we present an interactive IPython notebook that serves as a tutorial for performing a standard RNA-seq pipeline.
The following step is to identify the differentially expressed genes (DEG) between the two conditions. This is achieved with a unique method we developed called the Characteristic Direction (CD)[9].
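To give a feel for what a direction-based DEG method computes, here is a simplified sketch. It is not the authors' implementation of the Characteristic Direction: it assumes a regularized linear-discriminant style formulation, in which the mean expression difference between conditions is rescaled by a shrunken pooled covariance. The function name, the `gamma` shrinkage parameter, and the toy data are hypothetical.

```python
import numpy as np

def characteristic_direction_sketch(ctrl, pert, gamma=0.5):
    """Return a unit vector with one signed coefficient per gene.

    ctrl, pert: arrays of shape (n_genes, n_samples) for the two conditions.
    gamma: shrinkage weight in [0, 1] toward a scaled identity (assumed parameter).
    """
    ctrl = np.asarray(ctrl, dtype=float)
    pert = np.asarray(pert, dtype=float)
    mean_diff = pert.mean(axis=1) - ctrl.mean(axis=1)

    # Pooled within-group covariance across genes.
    centered = np.hstack([
        ctrl - ctrl.mean(axis=1, keepdims=True),
        pert - pert.mean(axis=1, keepdims=True),
    ])
    dof = max(centered.shape[1] - 2, 1)
    cov = centered @ centered.T / dof

    # Shrink toward a scaled identity so the matrix is invertible
    # even when there are far fewer samples than genes.
    n_genes = cov.shape[0]
    target = np.trace(cov) / n_genes * np.eye(n_genes)
    shrunk = (1.0 - gamma) * cov + gamma * target

    direction = np.linalg.solve(shrunk, mean_diff)
    return direction / np.linalg.norm(direction)

# Toy data (hypothetical): 50 genes, 3 samples per condition,
# with the first 5 genes upregulated in the perturbed condition.
rng = np.random.default_rng(1)
ctrl = rng.normal(5.0, 0.2, size=(50, 3))
pert = rng.normal(5.0, 0.2, size=(50, 3))
pert[:5] += 3.0

b = characteristic_direction_sketch(ctrl, pert)
top_genes = np.argsort(-np.abs(b))[:5]
print("Top-ranked gene indices:", sorted(top_genes.tolist()))
```

Genes are ranked by the absolute value of their coefficient, and the sign indicates up- or downregulation. Building the full gene-by-gene covariance is only practical for small toy inputs like this one; a genome-scale analysis would need a reduced-dimension formulation.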

Summary

Introduction

The increase in awareness about the irreproducibility of scientific research requires the development of methods that make experimental and computational protocols repeatable and transparent[1]. The advent of interactive notebooks for data analysis pipelines significantly enhances the recording and sharing of data, source code, and figures[2]. A number of recent publications have included an interactive notebook alongside the customary manuscript[3]. Here we present an interactive IPython notebook (http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb) that serves as a tutorial for performing a standard RNA-seq pipeline. We applied the pipeline to RNA-seq data from a recent publication in which human induced pluripotent stem cells were differentiated to neuronal progenitors and infected with Zika virus (ZIKV)[4]. The aim of that study was to begin to understand the molecular mechanisms underlying the devastating microcephaly phenotype observed in newborns of mothers infected with the virus during pregnancy.

Journal: F1000Research
Publication Date: Jul 5, 2016
Citations: 33
License type: CC BY 4.0