Abstract

Cancer classification from microarray gene expression data is an important area of research in computational biology and bioinformatics. Traditional supervised techniques often fail to reach the desired accuracy because the number of clinically labeled samples is very small. In such situations, active learning can play an important role: it computationally selects only the few most informative (confusing) samples, which are labeled by experts and added to the training set, which in turn can improve prediction accuracy. In this work, a novel active learning method using a rough-fuzzy classifier (ALRFC) is proposed for cancer sample classification from gene expression data. The proposed technique can handle the uncertainty, overlap, and indiscernibility usually present in the subtype classes of gene expression data. The proposed algorithm is tested on several publicly available benchmark cancer datasets, and its performance is compared with that of three other active learning techniques, one semi-supervised classification algorithm, and two (non-active) supervised learning counterparts in terms of prediction accuracy, precision, recall, F1-measure, and kappa. The experimental results show the proposed method to be superior to the other state-of-the-art techniques for cancer prediction. Paired t-tests confirm that the improvement over the other methods is statistically significant for most of the datasets.
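
The selection step described above (query the most confusing unlabeled samples, have an expert label them, add them to the training set) can be illustrated with a generic pool-based loop. This is a minimal sketch and not the ALRFC method: a plain nearest-centroid classifier stands in for the rough-fuzzy classifier, "confusing" is assumed to mean a small gap between the distances to the two nearest class centroids, and all names (`centroids`, `margin`, `active_learning`, `oracle`) and the toy data are hypothetical.

```python
import math

def centroids(X, y):
    """Mean feature vector per class, computed from the labeled set."""
    sums, counts = {}, {}
    for xi, yi in zip(X, y):
        prev = sums.get(yi, [0.0] * len(xi))
        sums[yi] = [s + v for s, v in zip(prev, xi)]
        counts[yi] = counts.get(yi, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def margin(x, cents):
    """Gap between the two smallest centroid distances (needs >= 2 classes).
    A small gap means the sample is confusing for the current model."""
    d = sorted(math.dist(x, c) for c in cents.values())
    return d[1] - d[0]

def predict(x, cents):
    """Assign x to the class with the nearest centroid."""
    return min(cents, key=lambda c: math.dist(x, cents[c]))

def active_learning(X_lab, y_lab, X_pool, oracle, rounds=2, batch=2):
    """Repeatedly query the `batch` most confusing pool samples."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_pool)
    for _ in range(rounds):
        if not pool:
            break
        cents = centroids(X_lab, y_lab)
        pool.sort(key=lambda x: margin(x, cents))  # most confusing first
        queried, pool = pool[:batch], pool[batch:]
        for x in queried:
            X_lab.append(x)
            y_lab.append(oracle(x))  # the expert supplies the label
    return centroids(X_lab, y_lab), X_lab, y_lab

# Toy two-feature data; the oracle plays the role of the clinical expert.
X_init, y_init = [(0.0, 0.0), (4.0, 4.0)], ["A", "B"]
X_pool = [(0.5, 0.2), (1.9, 2.0), (2.1, 2.2), (3.8, 3.9), (0.1, 0.4), (3.5, 4.1)]
oracle = lambda x: "A" if x[0] + x[1] < 4 else "B"

cents, X_out, y_out = active_learning(X_init, y_init, X_pool, oracle)
print(predict((0.2, 0.1), cents), predict((3.9, 3.9), cents))
```

In this toy run the first two queries are the points nearest the class boundary, which is the behavior the abstract attributes to active learning in general; real gene expression data would have thousands of features and would need the classifier and uncertainty measure of the actual method.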
