Abstract

Background: miRNAs play a key role in normal physiology and various diseases. miRNA profiling through next-generation sequencing (miRNA-seq) has become the main platform for biological research and biomarker discovery. However, analyzing miRNA sequencing data is challenging because it requires a significant amount of computational resources and bioinformatics expertise. Several web-based analytical tools have been developed, but they are limited to processing one sample or a pair of samples at a time and are not suitable for large-scale studies. The lack of flexibility and reliability of these web applications is also a common issue.

Results: We developed a Comprehensive Analysis Pipeline for microRNA Sequencing data (CAP-miRSeq) that integrates read pre-processing, alignment, mature/precursor/novel miRNA detection and quantification, data visualization, variant detection in the miRNA coding region, and more flexible differential expression analysis between experimental conditions. Depending on the available computational infrastructure, users can install the package locally or deploy it in Amazon Cloud, and can run samples sequentially or, for speedier analysis of a large number of samples, in parallel. In either case, summary and expression reports for all samples are generated for easier quality assessment and downstream analyses. Using well-characterized data, we demonstrated the pipeline’s superior performance, flexibility, and practical use in research and biomarker discovery.

Conclusions: CAP-miRSeq is a powerful and flexible tool for processing and analyzing miRNA-seq data, scaling from a few to hundreds of samples. The results are presented in a convenient way for investigators or analysts to conduct further investigation and discovery.

Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-423) contains supplementary material, which is available to authorized users.

Highlights

  • miRNAs play a key role in normal physiology and various diseases. miRNA profiling through next-generation sequencing has become the main platform for biological research and biomarker discovery

  • Hybridization-based microarray technology has been used for miRNA profiling, but it is hindered by its narrow detection range, higher susceptibility to technical variation [2], and inability to detect novel miRNAs and structural sequence changes. miRNA profiling through next-generation sequencing overcomes these limitations and has become increasingly

  • Pipeline performance: CAP-miRSeq is developed mainly for a cluster environment, where multiple jobs run in parallel for faster processing. The run time is therefore roughly the time needed for a single sample to complete the whole pipeline, plus the time needed for shared steps such as merging multiple samples and creating summary reports
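The run-time reasoning in the last highlight can be made concrete with a small back-of-the-envelope calculation. The sketch below is not part of CAP-miRSeq; the function names and all timings are hypothetical, and it assumes one job per sample with enough cluster slots for every sample to run at once.

```python
# Illustrative only: hypothetical helpers, not CAP-miRSeq code.

def sequential_minutes(per_sample_minutes, merge_minutes):
    """Wall time when samples run one after another, then results are merged."""
    return sum(per_sample_minutes) + merge_minutes


def parallel_minutes(per_sample_minutes, merge_minutes):
    """Wall time when every sample runs at the same time.

    Assumes the cluster has at least one free slot per sample, so the
    slowest sample sets the pace before the merge/report step starts.
    """
    if not per_sample_minutes:
        return merge_minutes
    return max(per_sample_minutes) + merge_minutes


# Made-up numbers: 40 samples at 30 minutes each, 5 minutes to merge and report.
samples = [30.0] * 40
print(sequential_minutes(samples, 5.0))  # 1205.0
print(parallel_minutes(samples, 5.0))    # 35.0
```

Under these assumptions the parallel estimate depends only on the slowest sample and the merge step, which is why adding samples barely changes the wall time as long as cluster slots are available.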


Summary

Introduction

miRNAs play a key role in normal physiology and various diseases. miRNA profiling through next-generation sequencing (miRNA-seq) has become the main platform for biological research and biomarker discovery.

OmiRas [7] is another recent web application; it lets users upload multiple raw sequence data sets and runs differential expression analysis between two sample groups with DESeq [8]. Common issues with the web-based tools are a lack of flexibility (limited parameter options, outdated reference genomes or miRNA annotations), a lack of reliability (servers that are down or not functional at all), and a lack of control over sensitive patient data. Most of these tools can process only one sample at a time, impose a data upload limit, or require pre-processed data as input. None of the tools detect single nucleotide variants (SNVs)/mutations in the coding region of miRNAs, which is increasingly important because such variants may affect miRNA binding to multiple targets [9,10,11].

