Abstract

High-throughput sequencing is a powerful tool for discovering and profiling microRNAs (miRNAs) and for gaining further insight into their biogenesis and function. Because of their short length, short RNAs in deep sequencing datasets are prone to mapping to multiple loci with an equal number of mismatches, especially among multicopy miRNA precursors and homologous miRNA genes. A systematic analysis of a SOLiD sequencing dataset showed that 37.94% of short RNAs could map simultaneously to more than one miRNA precursor, and even more short RNAs mapped to multiple genomic loci. Improper selection among candidate loci may discard mapping information, distort miRNA expression profiles, or even lead to the misidentification of novel miRNAs. A comprehensive study indicated several features that a correction strategy could use: the location and distribution of mismatches, quality values, and the expression profiles of multiple isomiRs (miRNA variants), miRNA*, and moRs (miRNA-offset RNAs) at the candidate locus and in its flanking sequence. Further studies should develop an approach based on these features to correct for the widespread phenomenon of multiple mapping and thereby improve the accuracy of miRNA profiling and discovery.
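As an illustration of the multiple-mapping problem described above, the following minimal Python sketch flags reads whose best alignments (fewest mismatches) are tied across more than one locus. It is not the method used in the study: the function names `find_multi_mapped` and `multi_mapping_rate`, the `(read_id, locus, mismatches)` tuple layout, and the example alignments are all assumptions made for illustration.

```python
from collections import defaultdict


def find_multi_mapped(alignments):
    """Return {read_id: [loci]} for reads tied at their lowest mismatch count.

    `alignments` is an iterable of (read_id, locus, mismatches) tuples.
    This layout is hypothetical; real aligner output would need parsing first.
    """
    hits_by_read = defaultdict(list)
    for read_id, locus, mismatches in alignments:
        hits_by_read[read_id].append((locus, mismatches))

    ambiguous = {}
    for read_id, hits in hits_by_read.items():
        best = min(m for _, m in hits)
        # Keep only loci that share the lowest mismatch count.
        best_loci = sorted({locus for locus, m in hits if m == best})
        if len(best_loci) > 1:
            ambiguous[read_id] = best_loci
    return ambiguous


def multi_mapping_rate(alignments):
    """Fraction of distinct reads with more than one equally good locus."""
    alignments = list(alignments)
    reads = {read_id for read_id, _, _ in alignments}
    if not reads:
        return 0.0
    return len(find_multi_mapped(alignments)) / len(reads)


# Illustrative (made-up) alignments; locus labels are examples only.
example = [
    ("read1", "hsa-let-7a-1", 0),
    ("read1", "hsa-let-7a-2", 0),
    ("read1", "hsa-let-7f-1", 1),
    ("read2", "hsa-mir-21", 0),
    ("read3", "hsa-mir-92a-1", 1),
    ("read3", "hsa-mir-92a-2", 1),
]
ambiguous_reads = find_multi_mapped(example)
```

Here `read1` and `read3` are each tied between two precursors, while `read2` maps uniquely. A correction strategy of the kind the abstract calls for would then choose among the tied loci using additional evidence, such as mismatch positions, quality values, or isomiR expression at each candidate locus.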
