A K-Means Clustering Algorithm Based on Double Attributes of Objects

Tu Linli, Chu Siyong, Deng Yanni

doi:10.1109/icmtma.2015.12

Abstract

The K-means clustering algorithm plays an important role in data analysis, pattern recognition, image processing, and market research. The classical K-means algorithm selects its initial cluster centers at random, which makes the clustering results unstable. In this paper, after an in-depth study of the classical K-means algorithm, we propose a new K-means clustering algorithm based on double attributes of objects. The algorithm constructs a Huffman tree from the dissimilarity degree matrix generated by the high-density set, and then selects the initial cluster centers from the Huffman tree according to the value of K. This method effectively overcomes the instability that random selection of the initial cluster centers causes in the classical K-means algorithm. The new algorithm is validated on two UCI data sets. The experimental results show that it stably selects high-quality initial cluster centers and thereby obtains better clustering results.
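The abstract does not give the details of the construction, so the following is only a minimal sketch of how a Huffman-style tree could yield deterministic initial centers. It makes several assumptions that are not taken from the paper: dissimilarity is plain Euclidean distance, a merged node is represented by the mean of its member points, and the high-density filtering step is omitted entirely. The function names `euclidean`, `mean_point`, and `huffman_style_initial_centers` are hypothetical. Stopping the merging when K nodes remain is used here as a shortcut for building the full tree and cutting it into K subtrees.

```python
import math


def euclidean(a, b):
    """Euclidean distance, used here as the (assumed) dissimilarity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def huffman_style_initial_centers(points, k):
    """Return k initial centers by repeatedly merging the two least
    dissimilar nodes, in the spirit of Huffman tree construction.

    Assumption: each node is represented by the mean of its members.
    The high-density filtering mentioned in the abstract is NOT implemented.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Every point starts as its own leaf node.
    nodes = [[tuple(p)] for p in points]

    while len(nodes) > k:
        # Find the pair of nodes with the smallest dissimilarity.
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = euclidean(mean_point(nodes[i]), mean_point(nodes[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best

        # Merge the pair into one parent node and drop the two children.
        merged = nodes[i] + nodes[j]
        nodes = [n for idx, n in enumerate(nodes) if idx not in (i, j)]
        nodes.append(merged)

    # The mean of each remaining subtree serves as an initial center.
    return [mean_point(n) for n in nodes]


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(huffman_style_initial_centers(data, 2))
```

Because no random choice is involved, repeated runs on the same data return the same centers, which is the stability property the abstract emphasizes. The pairwise search is cubic in the number of points, so this sketch is suitable only for small illustrative inputs.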
