Abstract
RNA-Seq techniques generate hundreds of millions of short RNA reads using next-generation sequencing (NGS). These reads can be mapped to reference genomes to investigate changes in gene expression, but improved procedures are needed for mining large RNA-Seq datasets to extract valuable biological knowledge. RNAMiner, a multi-level bioinformatics protocol and pipeline, has been developed for such datasets. It includes five steps: mapping RNA-Seq reads to a reference genome, calculating gene expression values, identifying differentially expressed genes, predicting gene functions, and constructing gene regulatory networks. To demonstrate its utility, we applied RNAMiner to datasets generated from human, mouse, Arabidopsis thaliana, and Drosophila melanogaster cells, and successfully identified differentially expressed genes, clustered them into cohesive functional groups, and constructed novel gene regulatory networks. The RNAMiner web service is available at http://calla.rnet.missouri.edu/rnaminer/index.html.
Highlights
Transcriptome analysis is essential for determining the relationship between the information encoded in a genome, its expression, and phenotypic variation [1,2]
Mapping RNA-Seq reads to a reference genome
The RNAMiner protocol and pipeline progressively reduce large datasets to valuable, comprehensible biological knowledge of manageable size, ranging from gene expression values and differentially expressed genes to gene function predictions and gene regulatory networks
Summary
Transcriptome analysis is essential for determining the relationship between the information encoded in a genome, its expression, and phenotypic variation [1,2]. Next-generation sequencing (NGS) of RNAs (RNA-Seq) has emerged as a powerful approach for transcriptome analysis [3,4] that has many advantages over microarray technologies [5,6,7]. An RNA-Seq experiment typically generates hundreds of millions of short reads that are mapped to reference genomes and counted as a measure of expression [5]. Mining the gigabytes or even terabytes of raw RNA-Seq data is an essential but challenging step in the analysis.
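To make the idea of counting mapped reads as a measure of expression concrete, the following is a minimal sketch. It is not RNAMiner's implementation: the function names, the toy read assignments, the counts-per-million normalization, and the pseudocount are all assumptions chosen for illustration, and a real analysis would rely on dedicated alignment and statistical tools.

```python
import math
from collections import Counter


def count_reads_per_gene(read_assignments):
    """Count mapped reads per gene; None marks a read that did not map."""
    return Counter(gene for gene in read_assignments if gene is not None)


def counts_per_million(counts):
    """Scale raw counts by the number of mapped reads (library size)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


def log2_fold_change(cpm_a, cpm_b, pseudocount=1.0):
    """log2 ratio of condition b over condition a.

    The pseudocount (an illustrative choice) avoids division by zero
    for genes that are absent in one condition.
    """
    genes = set(cpm_a) | set(cpm_b)
    return {
        gene: math.log2(
            (cpm_b.get(gene, 0.0) + pseudocount)
            / (cpm_a.get(gene, 0.0) + pseudocount)
        )
        for gene in genes
    }


# Hypothetical toy data: each entry is the gene a read mapped to.
control = ["geneA", "geneA", "geneB", None, "geneA"]
treated = ["geneA", "geneB", "geneB", "geneB", None]

control_cpm = counts_per_million(count_reads_per_gene(control))
treated_cpm = counts_per_million(count_reads_per_gene(treated))
fold_changes = log2_fold_change(control_cpm, treated_cpm)
```

In this toy example geneA has a negative log2 fold change (lower in the treated sample) and geneB a positive one. Calling genes differentially expressed in practice also requires replicates and a statistical test, which this sketch omits.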