Abstract

MicroRNAs (miRNAs) are important regulatory molecules in eukaryotic organisms. Existing methods for the identification of mature miRNA sequences in plants rely extensively on the search for stem-loop structures, leading to high false negative rates. Here, we describe a probabilistic method for ranking putative plant miRNAs using a naïve Bayes classifier and its publicly available implementation. We use a number of properties to construct the classifier, including sequence length, number of observations, existence of detectable predicted miRNA* sequences, the distribution of nearby reads and mapping multiplicity. We apply the method to small RNA sequence data from soybean, peach, Arabidopsis and rice and provide experimental validation of several predictions in soybean. The approach performs well overall and strongly enriches for known miRNAs over other types of sequences. By utilizing a Bayesian approach to rank putative miRNAs, our method is able to score miRNAs that would be eliminated by other methods, such as those that have low counts or lack detectable miRNA* sequences. As a result, we are able to detect several soybean miRNA candidates, including some that are 24 nucleotides long, a class that is almost universally eliminated by other methods.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.