Abstract

In this research, a hybrid wrapper model is proposed to identify the featured gene subset from the gene expression data. To balance the gap between exploration and exploitation, a hybrid model with a popular meta-heuristic algorithm named spider monkey optimizer (SMO) and simulated annealing (SA) is applied. In the proposed model, ReliefF is used as a filter to obtain the relevant gene subset from dataset by removing the noise and outliers prior to feeding the data to the wrapper SMO. To enhance the quality of the solution, simulated annealing is deployed as local search with the SMO in the second phase, which will guide to the detection of the most optimal feature subset. To evaluate the performance of the proposed model, support vector machine (SVM) as a fitness function to recognize the most informative biomarker gene from the cancer datasets along with University of California, Irvine (UCI) datasets. To further evaluate the model, 4 different classifiers (SVM, na¨ıve Bayes (NB), decision tree (DT), and k-nearest neighbors (KNN)) are used. From the experimental results and analysis, it’s noteworthy to accept that the ReliefF-SO-SA-SVM performs relatively better than its state-of-the-art counterparts. For cancer datasets, our model performs better in terms of accuracy with a maximum of 99.45%.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call