Abstract

Background: Genome-wide miRNA expression data can be used to study miRNA dysregulation comprehensively. Although many open-source tools for microRNA (miRNA)-seq data analysis are available, challenges remain in accurate miRNA quantification from large-scale miRNA-seq datasets. We implemented a pipeline called QuickMIRSeq for accurate quantification of known miRNAs and miRNA isoforms (isomiRs) from multiple samples simultaneously.

Results: QuickMIRSeq considers the unique nature of miRNAs and combines many important features into its implementation. First, it takes advantage of the high redundancy of miRNA reads and introduces joint mapping of multiple samples to reduce computational time. Second, it incorporates strand information in the alignment step for more accurate quantification. Third, reads potentially arising from background noise are filtered out to improve the reliability of miRNA detection. Fourth, sequences aligned to miRNAs with mismatches are remapped to a reference genome to further reduce false positives. Finally, QuickMIRSeq generates a rich set of QC metrics and publication-ready plots.

Conclusions: The rich visualization features allow end users to interactively explore the results and gain more insight into miRNA-seq data analyses. The high degree of automation and interactivity in QuickMIRSeq leads to a substantial reduction in the time and effort required for miRNA-seq data analysis.
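
The joint-mapping and noise-filtering ideas named in the abstract can be illustrated with a minimal Python sketch. It is not QuickMIRSeq's actual code: the function names, the `min_total` threshold, and the exact-match lookup that stands in for a real aligner are all assumptions made for illustration. The point is only that identical reads from all samples are collapsed first, so each unique sequence is matched once rather than once per read per sample.

```python
from collections import Counter, defaultdict


def collapse_reads(samples):
    """Collapse identical reads across all samples.

    samples: dict mapping sample name -> iterable of read sequences.
    Returns a dict mapping each unique sequence -> Counter of per-sample counts.
    """
    table = defaultdict(Counter)
    for sample, reads in samples.items():
        for seq in reads:
            table[seq][sample] += 1
    return dict(table)


def filter_low_count(table, min_total=2):
    """Drop sequences whose total count over all samples is below min_total.

    min_total is a hypothetical threshold standing in for a background-noise filter.
    """
    return {seq: c for seq, c in table.items() if sum(c.values()) >= min_total}


def quantify(table, reference):
    """Assign collapsed sequences to known miRNAs.

    reference: dict mapping miRNA name -> mature sequence.
    An exact-match dictionary lookup stands in for a real aligner here, so each
    unique sequence is looked up once regardless of how many reads carry it.
    """
    seq_to_name = {seq: name for name, seq in reference.items()}
    counts = defaultdict(Counter)
    for seq, per_sample in table.items():
        name = seq_to_name.get(seq)
        if name is not None:
            counts[name].update(per_sample)
    return dict(counts)
```

In this sketch the cost of the matching step scales with the number of unique sequences rather than the total number of reads, which is where the saving comes from when reads are highly redundant.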

Highlights

  • Genome-wide miRNA expression data can be used to study miRNA dysregulation comprehensively

  • Recent studies have shown that miRNA isoform (isomiR) sequences are tissue- and gender-specific [34] and play distinct roles in biological processes [37], which emphasizes the importance of performing miRNA-seq analysis simultaneously at both the miRNA and isomiR levels

  • The complete project reports can be downloaded from the QuickMIRSeq project home page

Summary

Results

QuickMIRSeq can analyze miRNA-seq datasets from any species as long as the corresponding mature miRNA and hairpin databases are available. When we analyzed an in-house cell-free miRNA-seq dataset from urine, we found that some samples had exceptionally high redundancy in unaligned reads (unpublished data). The read length distributions for samples SRR1759212, SRR1759213, SRR1759214, and SRR1759215 from GSE64977 are shown in Additional file 1: Figure S7. The difference mainly results from the fact that miRge ignores strand information when analyzing miRNA datasets and that its execution workflow tends to exclude reads with mismatches from quantification, as discussed further in Additional file 1: Figure S9. Bcbio-nextgen implements a configurable best-practices pipeline for small RNA-seq data analysis (https://bcbio-nextgen.readthedocs.io/en/latest/contents/pipelines.html#smallrna-seq), including quality controls, adapter trimming, miRNA/isomiR quantification, other small RNA detection, and prediction of new miRNAs. The quantification of known small RNAs is carried out by SeqBuster [45], a bioinformatic tool developed in 2010, while the quantification of isomiRs is done by an R script. QuickMIRSeq makes all analysis results fully accessible via a web interface and enables end users to visualize them interactively.
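
To make the remark about strand information concrete, the following Python sketch contrasts strand-aware and strand-agnostic counting on toy sequences. It does not reproduce the logic of miRge or QuickMIRSeq: `count_hits`, its `strand_aware` flag, and the exact-match comparison are hypothetical simplifications. It shows only that a counter which ignores strand also accepts reads matching the reverse complement of a mature miRNA, so the two modes can report different counts for the same data.

```python
# Translation table for DNA bases; N is kept as N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_hits(reads, mature_seq, strand_aware=True):
    """Count reads that exactly match a mature miRNA sequence.

    With strand_aware=True only forward-strand matches are counted.
    With strand_aware=False a read whose reverse complement equals the
    mature sequence is counted as well.
    """
    hits = 0
    for read in reads:
        if read == mature_seq:
            hits += 1
        elif not strand_aware and reverse_complement(read) == mature_seq:
            hits += 1
    return hits
```

For example, with the toy reads `["ACGGT", "ACCGT", "TTTTT"]` and the toy mature sequence `"ACGGT"`, the second read is the reverse complement of the mature sequence and is counted only when `strand_aware=False`.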
