Abstract
Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.
Highlights
Gene expression is a widely studied process and a major area of focus for functional genomics [1].
It has been reported that 85% of novel splicing events and 88% of differentially expressed exons predicted by RNA sequencing (RNA-seq) are validated by “gold-standard” approaches such as reverse transcription polymerase chain reaction (RT-PCR) and quantitative polymerase chain reaction (qPCR) [3].
To avoid repetition of effort, we advocate for these questions to be asked and answered within “BioStars”, an online question-and-answer forum for bioinformatics [90], in which a community can improve and update answers as RNA-seq analysis practices evolve.
Summary
Gene expression is a widely studied process and a major area of focus for functional genomics [1]. We make available open-access tutorials that cover cloud computing for RNA-seq analysis, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality control, and complete pipelines for expression, differential expression, and alternative splicing analysis (Supplementary Tutorials online at www.rnaseq.wiki). We provide an extensive introduction to cloud-computing concepts and specific cloud administration skills in the Supplementary Tutorials online (www.rnaseq.wiki). For these tutorials and the following analysis discussions, we selected the “tuxedo” suite and other commonly used tools to illustrate an example RNA-seq analysis workflow. Quality trimming generally removes the ends of reads where base quality scores have dropped to a level such that sequence errors and the resulting mismatches prevent reads from aligning. Tools such as skewer [57] and trimmomatic [58] bundle several algorithms together for adjusting raw RNA-seq data and assessing the quality of read data prior to alignment (Supplementary Tutorials at www.rnaseq.wiki). The “sashimi” plots [65] of IGV allow for the interpretation of complex RNA splicing patterns suggested by coverage patterns and junction-spanning reads in an RNA-seq dataset.
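To make the quality-trimming idea in the Summary concrete, the following is a minimal sketch in Python of trimming the 3' end of a single read. It is not the algorithm used by skewer or trimmomatic; it simply removes trailing bases whose Phred score falls below a cutoff. The function names, the default cutoff of 20, and the Phred+33 quality encoding are assumptions made for illustration.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string to a list of Phred scores.

    Assumes Phred+33 encoding unless a different offset is given.
    """
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, min_quality=20, offset=33):
    """Remove trailing bases whose quality is below min_quality.

    Illustrative only: walks inward from the 3' end and stops at the
    first base that meets the cutoff. Returns the trimmed sequence and
    the matching trimmed quality string.
    """
    if len(sequence) != len(quality_string):
        raise ValueError("sequence and quality string differ in length")
    scores = phred_scores(quality_string, offset)
    keep = len(scores)
    # Step back from the 3' end while the last kept base is low quality.
    while keep > 0 and scores[keep - 1] < min_quality:
        keep -= 1
    return sequence[:keep], quality_string[:keep]


if __name__ == "__main__":
    # 'I' encodes Phred 40 and '#' encodes Phred 2 under Phred+33.
    seq, qual = trim_3prime("ACGTACGT", "IIIII###")
    print(seq, qual)
```

In this hypothetical read, the last three bases have very low quality and are dropped, leaving the first five bases for alignment. Production trimmers typically also handle adapter removal and sliding-window quality checks, which this sketch omits.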