Abstract

Class imbalance in machine learning occurs when one class has significantly fewer training instances than another. In bioinformatics, this problem arises in the computational prediction of novel microRNAs (miRNAs) within a full genome. The well-known miRNA precursors (pre-miRNAs) usually number only a few, compared with the hundreds of thousands of potential candidates, which makes this task a highly imbalanced classification problem. High class imbalance is well known to degrade the performance of most classical supervised machine learning classifiers, so the imbalance must be considered explicitly. The Extreme Learning Machine (ELM) is a supervised artificial neural network model that has gained interest in recent years because of its fast learning speed and good performance. In this work, we propose a novel approach to overcome the high class imbalance in pre-miRNA prediction data, in which ELMs are used to predict good pre-miRNA candidates without needing balanced data sets. The proposal was validated on real datasets with several class imbalance levels. The results showed that the ELM approach outperformed very recent state-of-the-art methods under the same experimental conditions.
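
For readers unfamiliar with ELMs, the following is a minimal sketch of the general idea, not the method proposed in this work. A single hidden layer with random, fixed weights is followed by output weights obtained in closed form by regularized least squares. The per-sample weighting used to counter class imbalance is an assumption borrowed from weighted least squares. The function names (`train_elm`, `predict_elm`, `balanced_weights`) and all parameter values are hypothetical.

```python
import numpy as np


def balanced_weights(y):
    """Per-sample weights inversely proportional to class frequency (an assumed scheme)."""
    classes, inverse, counts = np.unique(y, return_inverse=True, return_counts=True)
    return len(y) / (len(classes) * counts[inverse])


def _hidden(X, W, b):
    # Sigmoid activations of the random hidden layer.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, y, n_hidden=50, reg=1e-2, sample_weight=None, seed=0):
    """Fit a basic ELM for labels in {-1, +1}; returns (W, b, beta)."""
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn at random and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = _hidden(X, W, b)
    if sample_weight is None:
        w = np.ones(len(y))
    else:
        w = np.asarray(sample_weight, dtype=float)
    # Weighted ridge solution: beta = (H^T D H + reg * I)^-1 H^T D y, with D = diag(w).
    HtD = H.T * w
    beta = np.linalg.solve(HtD @ H + reg * np.eye(n_hidden), HtD @ y)
    return W, b, beta


def predict_elm(model, X):
    """Return predicted labels in {-1, +1}."""
    W, b, beta = model
    scores = _hidden(X, W, b) @ beta
    return np.where(scores >= 0, 1, -1)
```

A typical call under these assumptions would be `model = train_elm(X, y, sample_weight=balanced_weights(y))` followed by `predict_elm(model, X_new)`, where `y` holds +1 for known pre-miRNAs and -1 for the other candidates.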
