Abstract

Background: Next generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. Some isomiRs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing of internal bases, resulting in base differences relative to the template DNA sequence. We hypothesized that some component of the isomiR variation reported so far could be due to systematic technical noise and not real.

Results: We have developed the XICRA pipeline to analyze small RNA sequencing data at the isomiR level. We exploited its ability to use single or merged reads to compare isomiR results derived from paired-end (PE) reads with those from single reads (SR) to address whether detectable sequence differences relative to canonical miRNAs found in isomiRs are true biological variations or the result of errors in sequencing. We have detected non-negligible systematic differences between SR and PE data which primarily affect putative internally edited isomiRs, and at a much smaller frequency terminal length changing isomiRs. This is relevant for the identification of true isomiRs in small RNA sequencing datasets.

Conclusions: We conclude that potential artifacts derived from sequencing errors and/or data processing could result in an overestimation of abundance and diversity of miRNA isoforms. Efforts in annotating the isomiRnome should take this into account.

Highlights

  • Next-generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs

  • Simulated Read 1 (R1) and Read 2 (R2) reads were generated in silico in one hundred replicates, each derived from an independent run of simulated read generation

  • We hypothesized that analysis of small RNA-seq PE data at the isomiR level is likely to improve the discriminating resolution of miRNA differential expression analysis


Summary

Introduction

Next-generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. MicroRNAs (miRNAs), a class of small non-coding RNAs (ncRNAs), have an average length of 21–23 nucleotides (nt). They have been widely studied as endogenous regulatory molecules that modulate gene expression post-transcriptionally by inducing target mRNA silencing and decay [1]. One of the resulting strands (defined as the mature miRNA) binds to the protein Argonaute 2 (Ago2) and is incorporated into the RNA-induced silencing complex (RISC) [4]. Target specificity for binding to mRNAs is mediated by the seed region (defined by miRNA nucleotide positions 2–8) [5], but other parts of the miRNA in central positions and offset bases have been shown to modulate miRNA functionality [6, 7] (Additional file 1: Fig. S1).
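To make the seed definition above concrete, the following minimal Python sketch extracts nucleotide positions 2–8 (1-based, inclusive) from a mature miRNA sequence and reports whether an internal base difference in a same-length isomiR falls inside the seed. It is an illustration only and is not part of the XICRA pipeline; the function names are hypothetical, and the let-7a-5p sequence is used purely as an example input.

```python
def seed_region(mirna: str, start: int = 2, end: int = 8) -> str:
    """Return the seed: 1-based positions start..end (inclusive) of the miRNA."""
    if len(mirna) < end:
        raise ValueError("sequence is shorter than the requested seed end position")
    # Convert the 1-based inclusive range to a 0-based Python slice.
    return mirna[start - 1:end]


def internal_mismatches(canonical: str, variant: str) -> list[int]:
    """Return 1-based positions where two equal-length sequences differ."""
    if len(canonical) != len(variant):
        raise ValueError("this sketch only compares sequences of equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(canonical, variant)) if a != b]


def seed_is_altered(canonical: str, variant: str) -> bool:
    """True if the variant's seed differs from the canonical seed."""
    return seed_region(canonical) != seed_region(variant)


# Illustrative input (assumption: hsa-let-7a-5p mature sequence as an example).
canonical = "UGAGGUAGUAGGUUGUAUAGUU"
seed_edit = "UGAGAUAGUAGGUUGUAUAGUU"     # hypothetical substitution at position 5
central_edit = "UGAGGUAGUAGAUUGUAUAGUU"  # hypothetical substitution at position 12

print(seed_region(canonical))
print(internal_mismatches(canonical, seed_edit), seed_is_altered(canonical, seed_edit))
print(internal_mismatches(canonical, central_edit), seed_is_altered(canonical, central_edit))
```

The sketch deliberately handles only same-length variants; length-changing isomiRs would need an alignment step before positions can be compared.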
