Abstract

K-means is a popular partition-based clustering algorithm. However, it has several shortcomings: it requires the user to specify the number of clusters in advance, it is sensitive to initial conditions, and it is easily trapped in a local optimum. The global K-means algorithm proposed by Likas et al. is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N runs of the K-means algorithm (N being the size of the data set) from suitable initial positions. It does not depend on any initial conditions or parameters and considerably outperforms the K-means algorithm, but it carries a heavy computational load. In this paper, we propose a new version of the global K-means algorithm. Its distinguishing feature is its execution time: it runs faster than the available global K-means algorithms. This advantage comes from an improved way of creating the next cluster center in the global K-means algorithm. Inspired by the K-medoids clustering algorithm suggested by Park and Jun, we define a novel function to select the optimal candidate center for the next cluster. Experiments on several well-known data sets from the UCI repository and on a synthetic data set show that our new algorithm significantly reduces computational time without degrading the performance of the global K-means algorithm. Further experiments demonstrate that our improved algorithm greatly outperforms the global K-means algorithm.
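
To make the cost of the baseline procedure concrete, the following is a minimal Python sketch of the global K-means search of Likas et al. as the abstract describes it: one center is added at a time, and each addition tries every data point as the initial position of the new center, which gives N runs of K-means per step. It does not implement the candidate-selection function proposed in this paper, which the abstract does not define. The names `sq_dist`, `mean`, `kmeans`, and `global_kmeans`, the `max_iter` cap, and the handling of empty clusters are illustrative assumptions.

```python
# Sketch of the baseline global K-means search (Likas et al.), pure Python.
# All function names and defaults here are illustrative assumptions.

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(data, centers, max_iter=100):
    """Plain K-means from the given initial centers; returns (centers, error)."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        clusters = [[] for _ in centers]
        for p in data:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Assumption: an empty cluster keeps its previous center.
        updated = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    error = sum(min(sq_dist(p, c) for c in centers) for p in data)
    return centers, error

def global_kmeans(data, k_max):
    """Add one center at a time, trying each data point as its start position."""
    data = [tuple(p) for p in data]
    centers = [mean(data)]  # the 1-cluster solution is the data centroid
    for _ in range(2, k_max + 1):
        best = None
        for candidate in data:  # N runs of K-means per added center
            trial = kmeans(data, centers + [candidate])
            if best is None or trial[1] < best[1]:
                best = trial
        centers = best[0]
    return centers
```

Calling `global_kmeans(data, k_max)` on a list of numeric tuples returns `k_max` centers. The inner loop over `data` is the source of the heavy computational load: every added center costs N full K-means runs, which is the step a faster candidate-selection rule would replace.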
