Abstract

Recent advances in genomics technologies have brought cancer research into the Big Data era. In Big Data analytics, both reducing dimensionality and designing an optimal analysis model are challenging tasks. This paper presents a two-phase feature selection model that addresses this challenge for the multi-class cancer classification problem. In the first phase, the model uses an ensemble of filters to mitigate the curse of dimensionality, eliminating irrelevant genes while retaining the prognostic genes. In the second phase, hybrid particle swarm optimization (HPSO) is employed to further reduce the dimensionality of the prognostic genes and simultaneously optimize the classifier structure to achieve optimal classification accuracy. To confirm the effectiveness of the proposed model, it is compared with other well-known recent methods on five benchmark microarray datasets using 10-fold cross-validation. Experimental results demonstrate its superior performance in selecting a compact prognostic gene subset while maintaining classification accuracy.
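The first phase can be illustrated with a minimal sketch of a rank-aggregation ensemble of filters. The abstract does not name the filters or the aggregation rule, so everything here is an assumption made for illustration: the two scoring functions (`f_score`, `range_score`), the sum-of-ranks aggregation, the helper names, and the toy expression matrix. The second phase (HPSO) is not shown.

```python
from statistics import mean, pvariance

def f_score(values, labels):
    """Hypothetical filter 1: between-class over within-class sum of squares."""
    overall = mean(values)
    between, within = 0.0, 0.0
    for c in sorted(set(labels)):
        group = [v for v, y in zip(values, labels) if y == c]
        m = mean(group)
        between += len(group) * (m - overall) ** 2
        within += sum((v - m) ** 2 for v in group)
    return between / (within + 1e-12)

def range_score(values, labels):
    """Hypothetical filter 2: spread of class means scaled by overall std."""
    means = []
    for c in sorted(set(labels)):
        means.append(mean([v for v, y in zip(values, labels) if y == c]))
    return (max(means) - min(means)) / (pvariance(values) ** 0.5 + 1e-12)

def ranks(scores):
    """Rank 0 goes to the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order):
        result[index] = position
    return result

def ensemble_filter(X, y, k, filters=(f_score, range_score)):
    """Keep the k genes with the lowest summed rank across all filters."""
    n_genes = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_genes)]
    total = [0] * n_genes
    for filt in filters:
        scores = [filt(col, y) for col in columns]
        for j, r in enumerate(ranks(scores)):
            total[j] += r
    best = sorted(range(n_genes), key=lambda j: total[j])[:k]
    return sorted(best)

# Toy data (assumed): 6 samples, 4 genes, 3 classes.
# Genes 0 and 2 separate the classes; genes 1 and 3 are noise.
X = [
    [1.0, 5.0, 0.1, 2.0],
    [1.1, 3.0, 0.2, 2.1],
    [3.0, 4.0, 1.0, 2.0],
    [3.1, 5.1, 1.1, 1.9],
    [5.0, 3.1, 2.0, 2.1],
    [5.1, 4.1, 2.1, 2.0],
]
y = [0, 0, 1, 1, 2, 2]

selected = ensemble_filter(X, y, k=2)
print(selected)
```

On this toy matrix both filters rank genes 0 and 2 highest, so those two survive the first phase. In a pipeline like the one described, the reduced gene set would then be passed to the optimizer in the second phase.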
