Microarray technology enables efficient analysis of gene expression data and supports the understanding of important biological processes. We have developed an efficient clustering scheme for microarray gene expression data that combines correlation-based feature selection, ant-based clustering, the fuzzy c-means algorithm, and a novel heaps merging heuristic. The scheme uses feature selection to overcome the high-dimensionality problem encountered in the bioinformatics domain. Based on extensive empirical analysis on microarray data, the clustering quality of the ant-based clustering algorithm is enhanced by the fuzzy c-means algorithm and the heaps merging heuristic. The performance of the proposed clustering scheme is compared with k-means, PAM, CLARA, self-organizing map, hierarchical clustering, divisive analysis clustering, self-organizing tree algorithm, hybrid hierarchical clustering, consensus clustering, AntClass, and fuzzy c-means clustering. The experimental results indicate that the proposed clustering scheme yields better performance in clustering cancer gene expression data.
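As an illustration of one component named above, the following is a minimal sketch of textbook fuzzy c-means, not the proposed scheme itself. The abstract gives no parameter settings, so the fuzzifier `m = 2`, the iteration limit, the tolerance, the random initialization, and the function name `fuzzy_c_means` are all assumptions made for this example. Feature selection, ant-based clustering, and the heaps merging heuristic are not shown.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on a list of equal-length numeric vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    n = len(points)
    dim = len(points[0])

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update: proportional to distance ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            # Clamp distances away from zero to avoid division by zero
            # when a point coincides with a center.
            dists = [max(math.dist(points[i], c), 1e-12) for c in centers]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u
```

In a pipeline of the kind the abstract outlines, `points` would presumably be expression profiles restricted to the selected features, and a hard assignment can be read off by taking the largest membership in each row.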