Abstract

Single nucleotide polymorphisms (SNPs) are the most common form of genetic variation in humans and account for nearly 1/1,000th of the average human genome. Intelligent analysis of such databases can be degraded by the presence of irrelevant features, which motivates the application of feature selection. In this work, we propose a genetic-algorithm-based feature selection approach. A genetic algorithm (GA) is a search heuristic that mimics the process of natural selection and is routinely used to generate useful solutions to optimization and search problems. Clustering is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. The Bee Colony Optimization (BCO) algorithm is a population-based search algorithm that mimics the food-foraging behaviour of honey bee colonies. In its basic version, the algorithm combines neighbourhood search with global search and can be used for both combinatorial and continuous optimization. In this paper, the proposed feature selection approach, genetic clustering with BCO, was successfully applied to leukemia cancer data sets. The approach reduced the number of features by 80%. The accuracy and specificity for the significant gene/SNP set were 70% and 82%, respectively. The number of features was thus considerably reduced while the quality of the extracted knowledge was enhanced.
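To make the GA-driven feature selection idea concrete, here is a minimal sketch of a wrapper-style search over feature subsets. It is not the method of the paper: the BCO and clustering stages are omitted, the data are synthetic, and a simple nearest-centroid classifier stands in for whatever evaluation the authors used. All names (`make_toy_data`, `centroid_accuracy`, `fitness`, `ga_select`) and all parameter values are hypothetical.

```python
import random

def make_toy_data(n_samples=60, n_features=20, informative=(0, 1, 2), seed=0):
    """Synthetic two-class data (assumption): only `informative` columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 3.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def centroid_accuracy(X, y, mask):
    """Training accuracy of a nearest-centroid classifier on the selected features."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    cents = {}
    for c in (0, 1):
        rows = [x for x, lab in zip(X, y) if lab == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, lab in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(idx, cents[c])) for c in (0, 1)}
        correct += min(dist, key=dist.get) == lab
    return correct / len(y)

def fitness(X, y, mask, penalty=0.01):
    """Reward accuracy, penalise each selected feature (penalty value is an assumption)."""
    return centroid_accuracy(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=30, generations=40, p_mut=0.05, seed=1):
    """Evolve bit masks: keep the better half, refill with crossover plus mutation."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        elite = ranked[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)                      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(X, y, m))

if __name__ == "__main__":
    X, y = make_toy_data()
    best = ga_select(X, y)
    print("selected feature indices:", [j for j, bit in enumerate(best) if bit])
    print("training accuracy on selected features:", centroid_accuracy(X, y, best))
```

The per-feature penalty in `fitness` is what pushes the search toward small subsets. In a real study the accuracy term would be estimated on held-out data rather than on the training set, as it is here for brevity.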
