Abstract

Small RNA sequencing (RNA-seq) for microRNAs (miRNAs) is a rapidly developing field where opportunities still exist to create better bioinformatics tools to process these large datasets and generate new, useful analyses. We built miRge to be a fast, smart small RNA-seq solution to process samples in a highly multiplexed fashion. miRge employs a Bayesian alignment approach, whereby reads are sequentially aligned against customized mature miRNA, hairpin miRNA, noncoding RNA and mRNA sequence libraries. miRNAs are summarized at the level of raw reads in addition to reads per million (RPM). Read counts for all other RNA species (tRNA, rRNA, snoRNA, mRNA) are also provided, which is useful for identifying potential contaminants and optimizing small RNA purification strategies. miRge was designed to optimally identify miRNA isomiRs and employs an entropy-based statistical measurement to identify differential production of isomiRs. This allowed us to identify decreasing entropy in isomiRs as stem cells mature into retinal pigment epithelial cells. Conversely, we show that pancreatic tumor miRNAs have similar entropy to matched normal pancreatic tissues. In a head-to-head comparison with other miRNA analysis tools (miRExpress 2.0, sRNAbench, omiRAs, miRDeep2, Chimira, UEA small RNA Workbench), miRge was faster (4- to 32-fold) and was among the top two methods in maximally aligning miRNA reads per sample. Moreover, miRge has no inherent limits to its multiplexing. miRge was capable of simultaneously analyzing 100 small RNA-seq samples in 52 minutes, providing an integrated analysis of miRNA expression across all samples. As miRge was designed for analysis of single as well as multiple samples, it is an ideal tool for both high- and low-throughput users. miRge is freely available at http://atlas.pathology.jhu.edu/baras/miRge.html.
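The two summary quantities named in the abstract, RPM and an entropy measure over isomiRs, can be illustrated with a minimal Python sketch. This is not miRge's code. The function names are hypothetical, the RPM denominator is assumed here to be the total of the supplied counts unless another total is passed in, and Shannon entropy (in bits) is assumed as the entropy measure, since the abstract does not state the exact formula.

```python
import math


def reads_per_million(counts, total_reads=None):
    """Scale raw read counts to reads per million (RPM).

    Assumption: if total_reads is not given, the sum of the supplied
    counts is used as the denominator. The tool described in the text
    may normalize against a different total.
    """
    total = sum(counts.values()) if total_reads is None else total_reads
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


def isomir_entropy(isomir_counts):
    """Shannon entropy (bits) of the read distribution across isomiRs.

    Assumption: plain Shannon entropy over the isomiR read fractions of
    one miRNA; this stands in for the entropy-based measure in the text.
    """
    total = sum(isomir_counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in isomir_counts:
        if count > 0:  # zero-count isomiRs contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    raw = {"miR-A": 750, "miR-B": 250}
    print(reads_per_million(raw))
    print(isomir_entropy([10, 10, 10, 10]))  # reads spread evenly over 4 isomiRs
    print(isomir_entropy([40]))              # all reads in a single isoform
```

Under this assumed definition, a miRNA whose reads are split evenly across four isomiRs has an entropy of 2 bits, while a miRNA with a single isoform has an entropy of 0. A fall in this value would correspond to reads concentrating in fewer isomiR forms, which is the direction of change the abstract reports during stem cell maturation.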

Highlights

  • MicroRNAs are short (17–24 bp) RNA species that regulate translation across most species [1]

  • The popularity of RNA sequencing (RNA-seq) for miRNA profiling has risen as the cost of sequencing has decreased

  • For speed and alignment tests, we evaluated 103 short RNA-seq Illumina datasets obtained from the Sequence Read Archive (SRA)


Summary

Introduction

MicroRNAs (miRNAs) are short (17–24 bp) RNA species that regulate translation across most species [1]. RNA-seq is ideal for miRNA profiling because it allows the characterization of all known and unknown miRNAs, including isomiR forms, from a given RNA source. This advantage is tempered by the need for significantly more starting material than is necessary for qRT-PCR-based approaches. A variety of RNA-seq computational tools exist, each with certain advantages and limitations, and there is no consensus on an optimal method. This has created an opportunity for a new generation of fast and accurate tools to quantitate, annotate, and summarize the data for each miRNA species from a sequencing run [3]. As more miRNA RNA-seq data are reported, there has been a growing appreciation of isomiRs and of the need to identify them in RNA-seq datasets [4].

