Abstract

The k-means algorithm is one of the most widely used methods to partition a dataset into groups of patterns. However, most k-means methods require expensive distance calculations of centroids to achieve convergence. In this paper, we present an efficient algorithm to implement a k-means clustering that produces clusters comparable to slower methods. In our algorithm, we partition the original dataset into blocks; each block unit, called a unit block (UB), contains at least one pattern. We can locate the centroid of a unit block (CUB) by using a simple calculation. All the computed CUBs form a reduced dataset that represents the original dataset. The reduced dataset is then used to compute the final centroid of the original dataset. We only need to examine each UB on the boundary of candidate clusters to find the closest final centroid for every pattern in the UB. In this way, we can dramatically reduce the time for calculating final converged centroids. In our experiments, this algorithm produces comparable clustering results as other k-means algorithms, but with much better performance.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.