Abstract

Cancer is a group of related diseases with high mortality rate characterized by abnormal cell growth which attacks the body tissues. Microarray cancer data is a prominent research topic across many disciplines focused to address problems related to the higher curse of dimensionality, a small number of samples, noisy data and imbalance class. A random forest (RF) tree-based feature selection and ensemble learning based on hard voting and soft voting is proposed to classify microarray cancer data using six different base classifiers. The selected features due to RF tree are submitted to the base classifiers as the training set. Then, an ensemble learning method is applied to the base classifiers in which case each base classifier predicts class label individually. The final prediction is carried out hard and soft voting techniques that use majority voting and weighted probability on the test set. The proposed ensemble learning method is validated on eight different standard microarray cancer datasets, of which three of the datasets are binary class and the remaining five datasets are multi-class datasets. Experimental results of the proposed method show 1.00 classification accuracy on six of the datasets and 0.96 on two of the datasets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call