Abstract
RNA-Seq is increasingly being used to measure human RNA expression on a genome-wide scale. Expression profiles can be interrogated to identify and functionally characterize treatment-responsive genes. Ultimately, such controlled studies promise to reveal insights into molecular mechanisms of treatment effects, identify biomarkers, and realize personalized medicine. RNA-Seq Reports (RSEQREP) is a new open-source cloud-enabled framework that allows users to execute start-to-end gene-level RNA-Seq analysis on a preconfigured RSEQREP Amazon Virtual Machine Image (AMI) hosted by AWS or on their own Ubuntu Linux machine via a Docker container or installation script. The framework works with unstranded, stranded, and paired-end sequence FASTQ files stored locally, on Amazon Simple Storage Service (S3), or at the Sequence Read Archive (SRA). RSEQREP automatically executes a series of customizable steps including reference alignment, CRAM compression, reference alignment QC, data normalization, multivariate data visualization, identification of differentially expressed genes, heatmaps, co-expressed gene clusters, enriched pathways, and a series of custom visualizations. The framework outputs a file collection that includes a dynamically generated PDF report using R, knitr, and LaTeX, as well as publication-ready table and figure files. A user-friendly configuration file handles sample metadata entry, processing, analysis, and reporting options. The configuration supports time series RNA-Seq experimental designs with at least one pre- and one post-treatment sample for each subject, as well as multiple treatment groups and specimen types. All RSEQREP analysis components are built using open-source R code and R/Bioconductor packages, allowing for further customization.
As a use case, we provide RSEQREP results for a trivalent influenza vaccine (TIV) RNA-Seq study that collected 1 pre-TIV and 10 post-TIV vaccination samples (days 1-10) for 5 subjects and two specimen types (peripheral blood mononuclear cells and B-cells).
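The design constraint stated above (at least one pre- and one post-treatment sample per subject) can be illustrated with a minimal Python sketch. This is not RSEQREP code or its configuration format. The function name `check_design`, the tuple layout, the subject labels, and the convention that day 0 marks the pre-treatment sample are all assumptions made for illustration. The metadata is synthetic and only mirrors the shape of the TIV use case.

```python
from collections import defaultdict


def check_design(samples):
    """Return (subject, specimen) pairs lacking a pre- or a post-treatment sample.

    `samples` is an iterable of (subject_id, specimen_type, day) tuples.
    Assumption for this sketch: day <= 0 is pre-treatment, day > 0 is post-treatment.
    """
    days = defaultdict(set)
    for subject, specimen, day in samples:
        days[(subject, specimen)].add(day)
    return sorted(
        key
        for key, seen in days.items()
        if not (any(d <= 0 for d in seen) and any(d > 0 for d in seen))
    )


# Synthetic metadata shaped like the TIV use case: 5 subjects, 2 specimen types,
# one pre-vaccination sample (day 0, assumed label) and days 1-10 post-vaccination.
subjects = [f"S{i}" for i in range(1, 6)]   # hypothetical subject IDs
specimens = ["PBMC", "B-cell"]
tiv_samples = [(s, sp, d) for s in subjects for sp in specimens for d in range(0, 11)]

print(len(tiv_samples))           # size of the full design if no sample is missing
print(check_design(tiv_samples))  # pairs that violate the pre/post requirement
```

If every planned sample were present, this design would contain 5 × 2 × 11 = 110 samples; the abstract does not state how many samples were actually analyzed.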
Highlights
The advent of next-generation sequencing (NGS) technologies has dramatically reduced costs and democratized sequencing [1].
We found that a c3.xlarge computational Elastic Compute Cloud (EC2) instance type (4 vCPUs, 7.5 GiB, https://aws.amazon.com/ec2/instance-types) is sufficient for data processing and analysis, but a higher memory machine (c3.4xlarge: 16 GiB for HISAT2 and c3.8xlarge: 37 GiB for STAR) is required to successfully complete the indexing of the reference genome sequence as part of Step 1.
Using the default RNA sequencing (RNA-Seq) Reports (RSEQREP) Amazon Virtual Machine Image (AMI), in addition to on-demand scalable computational resources, has the benefit of integrating the operating system and all software installations as part of analysis snapshots referenced in the report, providing for complete transparency and full reproducibility of all components involved.
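The instance sizing reported in the highlights can be written down as a small lookup. The following Python sketch is illustrative only: the function `suggest_instance` and the table names are hypothetical, and the instance types and memory figures are copied from the highlight as quoted, not independently verified against AWS documentation.

```python
# Instance types and memory figures (GiB) as quoted in the highlights.
# The lookup structure itself is an assumption made for this sketch.
INDEXING_INSTANCE = {
    "hisat2": ("c3.4xlarge", 16),
    "star": ("c3.8xlarge", 37),
}
DEFAULT_INSTANCE = ("c3.xlarge", 7.5)  # reported as sufficient for processing and analysis


def suggest_instance(aligner, build_index):
    """Return (instance_type, quoted_memory_gib) for a run.

    `build_index` is True when the reference genome still has to be indexed
    (part of Step 1), which is the memory-intensive case in the text.
    """
    if not build_index:
        return DEFAULT_INSTANCE
    key = aligner.lower()
    if key not in INDEXING_INSTANCE:
        raise ValueError(f"no sizing reported for aligner: {aligner}")
    return INDEXING_INSTANCE[key]


print(suggest_instance("STAR", build_index=True))
print(suggest_instance("HISAT2", build_index=False))
```

One design consequence of this split is that the larger instance is only needed while the index is being built; once an index exists, the smaller instance type is reported to be enough.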
Summary
We provide RSEQREP results for a trivalent influenza vaccine (TIV) RNA-Seq study that collected 1 pre-TIV and 10 post-TIV vaccination samples (days 1-10) for 5 subjects and two specimen types (peripheral blood mononuclear cells and B-cells).
Keywords: RSEQREP, RNA-Seq, transcriptomics, differential gene expression, pathway enrichment, reproducible research, cloud computing, trivalent influenza vaccine.
This article is included in the RPackage gateway and in the International Society for Computational Biology Community Journal gateway.