The discriminant power of RNA features for pre-miRNA recognition.

Ivani de O. N. Lopes, Alexander Schliep, André C. P. de L. F. de Carvalho

doi:10.1186/1471-2105-15-124

Abstract

Background: Computational discovery of microRNAs (miRNAs) is based on pre-determined sets of features from miRNA precursors (pre-miRNAs). Some feature sets are composed of sequence-structure patterns commonly found in pre-miRNAs, while others combine more sophisticated RNA features. In this work, we analyze the discriminant power of seven feature sets, which are used in six pre-miRNA prediction tools. The analysis is based on the classification performance achieved with these feature sets for the training algorithms used in these tools. We also evaluate feature discrimination through the F-score and through feature importance in the induction of random forests.

Results: Small or non-significant differences were found among the estimated classification performances of classifiers induced using sets with diversification of features, despite the wide differences in their dimension. Inspired by these results, we obtained a lower-dimensional feature set, which achieved a sensitivity of 90% and a specificity of 95%. These estimates are within 0.1% of the maximal values obtained with any feature set (SELECT, Section “Results and discussion”), while it is 34 times faster to compute. Even compared to another feature set (FS2, see Section “Results and discussion”), which is the computationally least expensive of the feature sets from the literature that perform within 0.1% of the maximal values, it is 34 times faster to compute. The results obtained by the tools used as references in the experiments showed that five out of these six tools have lower sensitivity or specificity.

Conclusion: In miRNA discovery, the number of putative miRNA loci is on the order of millions. Analysis of putative pre-miRNAs using a computationally expensive feature set would be wasteful or even infeasible for large genomes. In this work, we propose a relatively inexpensive feature set and explore most of the learning aspects implemented in current ab-initio pre-miRNA prediction tools, which may lead to the development of efficient ab-initio pre-miRNA discovery tools.

The material to reproduce the main results from this paper can be downloaded from http://bioinformatics.rutgers.edu/Static/Software/discriminant.tar.gz.
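The abstract refers to two kinds of quantities: an F-score for ranking individual features, and sensitivity and specificity for summarizing classifier performance. The sketch below illustrates both in plain Python. It is a minimal illustration, not the authors' code: the abstract does not define the F-score, so the formula here assumes the commonly used feature-ranking F-score (between-class separation of the class means divided by the sum of within-class sample variances). The function names and the toy numbers are hypothetical.

```python
from statistics import mean, variance


def f_score(pos_values, neg_values):
    """F-score of a single feature (assumed definition, see lead-in).

    pos_values: feature values for positive examples (e.g. real pre-miRNAs)
    neg_values: feature values for negative examples (e.g. pseudo hairpins)
    Each class needs at least two values so a sample variance exists.
    """
    overall = mean(list(pos_values) + list(neg_values))
    # Squared distance of each class mean from the overall mean.
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    # Sum of the within-class sample variances (n - 1 denominator).
    within = variance(pos_values) + variance(neg_values)
    if within == 0:
        # Degenerate case: no spread inside either class.
        return float("inf") if between > 0 else 0.0
    return between / within


def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of positives that are recovered
    specificity = tn / (tn + fp)  # fraction of negatives that are rejected
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical toy values for one feature in each class.
    print(f_score([1, 2, 3], [4, 5, 6]))
    # Hypothetical counts chosen to give 90% sensitivity and 95% specificity.
    print(sensitivity_specificity(tp=90, fn=10, tn=95, fp=5))
```

A larger F-score suggests that the feature separates the two classes better when considered on its own. It says nothing about interactions between features, which is one reason a second measure such as random forest feature importance is useful alongside it.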

Highlights

Computational discovery of microRNAs is based on pre-determined sets of features from miRNA precursors
A microRNA is a small non-coding RNA molecule that modulates the stability of messenger RNA (mRNA) targets and their rate of translation into proteins [1]
Aiming to recommend effective and less costly sets of features, we investigated the discriminant power of seven RNA feature sets, under controlled sources of variation

Summary

Introduction

Computational discovery of microRNAs (miRNAs) is based on pre-determined sets of features from miRNA precursors (pre-miRNAs). We analyze the discriminant power of seven feature sets, which are used in six pre-miRNA prediction tools. Maturation of canonical miRNAs occurs in two steps. First, the long primary miRNA transcript is processed within the nucleus into a ∼60–120 nucleotide (nt) stem-loop hairpin precursor (pre-miRNA) by the enzyme Drosha [6]. Second, after export to the cytoplasm, the enzyme Dicer cleaves the pre-miRNA, separating the loop from an RNA duplex. The loop is degraded as a by-product [7], whereas the RNA duplex is unwound by helicase activity, releasing the mature miRNA and the star sequence [6]. The latter is typically degraded, whereas the mature miRNA guides the microribonucleoprotein complex (miRNP) to target messenger RNAs (mRNAs) by partial sequence complementarity [7].

Journal: BMC Bioinformatics
Publication Date: May 2, 2014
Citations: 77
License type: cc-by
