Abstract

Clustering is a data mining technique that groups objects with similar characteristics. The similarity criterion is implementation dependent. Clustering analyzes data objects without consulting a known class label or category, i.e., it is an unsupervised data mining technique. K-means is a widely used clustering algorithm that chooses random initial cluster centers (centroids), one for each cluster. The performance of K-means strongly depends on this initial guess, and the final centroids may not be optimal because the algorithm can converge to a local optimum. It is therefore important for K-means to start from a good choice of initial centroids. A clustering algorithm is proposed that selects initial centroids using the sum of distances from each data object to all other data objects. The proposed algorithm produces better clustering than the K-means technique on synthetic as well as real datasets.
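
The idea of seeding K-means from per-object distance sums can be sketched as follows. The abstract does not state the exact selection rule, so this is only an illustration under stated assumptions: `distance_sums` computes the quantity the abstract names, while `pick_seeds` uses a hypothetical rule (the object with the smallest distance sum first, then a farthest-first traversal, which is a swapped-in technique and not necessarily the proposed one). The function names are invented for this sketch, and `kmeans` is the standard Lloyd iteration.

```python
import math


def distance_sums(points):
    """For each point, sum its Euclidean distances to all other points."""
    return [sum(math.dist(p, q) for q in points) for p in points]


def pick_seeds(points, k):
    """Hypothetical seeding rule (an assumption, not taken from the abstract).

    The first seed is the point with the smallest distance sum; each further
    seed is the point farthest from the seeds chosen so far.
    """
    sums = distance_sums(points)
    seeds = [points[min(range(len(points)), key=lambda i: sums[i])]]
    while len(seeds) < k:
        seeds.append(
            max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        )
    return seeds


def kmeans(points, seeds, max_iter=100):
    """Standard Lloyd iterations starting from the given seeds."""
    centroids = list(seeds)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)),
                    key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Usage on a small made-up dataset (points are tuples of coordinates).
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
final_centroids, final_clusters = kmeans(data, pick_seeds(data, 2))
```

Because the seeds are computed deterministically from the data, repeated runs give the same clustering, unlike random initialization. Computing all pairwise distances costs quadratic time in the number of objects, which may matter for large datasets.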
