A STUDY AND ANALYSIS OF CLUSTERING ALGORITHMS ON HIV-1 INFECTION MICROARRAY DATASET FOR FINDING CLUSTER WISE COMMON GENES

Uma M, R Porkodi

doi:10.26483/ijarcs.v8i5.4049

Abstract

Data mining refers to extracting knowledge from large amounts of data. It is used in various medical applications such as tumor clustering, protein structure prediction, gene selection, cancer classification based on microarray data, clustering of gene expression data, and statistical modeling of protein-protein interactions. This work analyzes four clustering algorithms, namely K-means, Fuzzy c-means, hierarchical clustering, and Partitioning Around Medoids (PAM), on an in vitro time-course microarray dataset of the effect of HIV-1 infection on macrophages. The clustering algorithms are validated using internal validation measures, namely the Dunn index, Dunn index 2, the Calinski-Harabasz index, and the average silhouette width, in order to identify the best of the four algorithms. Finally, the proposed research work also aims to find the common genes present in each cluster produced by the four clustering algorithms.
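
As an illustration of how two of the internal validation measures named above can rank competing clusterings, the following is a minimal sketch using the standard definitions of the Dunn index and the average silhouette width with Euclidean distance. It is not the implementation used in this work; the function names and the four toy points are assumptions made for the example, and it expects at least two clusters.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by the largest cluster diameter."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    # Diameter: largest distance between two points of the same cluster.
    max_diameter = max(
        (dist(a, b) for members in clusters.values()
         for a, b in combinations(members, 2)),
        default=0.0,
    )
    # Separation: smallest distance between points of different clusters.
    min_separation = min(
        dist(a, b)
        for c1, c2 in combinations(list(clusters.values()), 2)
        for a in c1 for b in c2
    )
    return float("inf") if max_diameter == 0 else min_separation / max_diameter


def average_silhouette_width(points, labels):
    """Mean of s(i) = (b - a) / max(a, b) over all points."""
    total = 0.0
    for i, point in enumerate(points):
        sums, counts = {}, {}
        for j, other in enumerate(points):
            if i != j:
                sums[labels[j]] = sums.get(labels[j], 0.0) + dist(point, other)
                counts[labels[j]] = counts.get(labels[j], 0) + 1
        own = labels[i]
        if own not in counts:
            continue  # singleton cluster: s(i) is taken as 0
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(sums[k] / counts[k] for k in sums if k != own)  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy data (hypothetical): two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # splits each nearby pair

for name, labels in (("good", good_labels), ("bad", bad_labels)):
    print(name, dunn_index(toy_points, labels),
          average_silhouette_width(toy_points, labels))
```

Higher values of both measures indicate more compact, better-separated clusters, so the labeling that groups nearby points together scores higher on both. On real expression data, each point would be a gene's expression profile across the time points.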

Full Text