Analysis of the effect early cluster centre points on the combination of k-means algorithms and sum of squared error on k centroid

D Selvida, M Zarlis, Z Situmorang

doi:10.1088/1757-899x/725/1/012089

Abstract

K-Means is a partition-based clustering algorithm in which each data point is assigned to exactly one of K groups; the algorithm fixes the number of groups at the outset and defines a set of K centroids. The choice of initial cluster centres strongly influences the outcome of the clustering process and therefore the quality of the clustering. Better clustering results are often obtained only after several trials. The Sum of Squared Error (SSE) is a measure of homogeneity, or uniformity, within a cluster. In this study, SSE was used as an approach to determine the initial cluster centre points of the K-Means algorithm. Tests were carried out on two datasets with 2, 3, 4, 5, 6, 7, 8, and 9 centroids; on the iris data, the runs with 3 and 4 centroids achieved better iteration counts when K-Means was combined with SSE. These results show that selecting the initial cluster centres of the K-Means algorithm on the basis of the minimum SSE value can improve the clustering results and the resulting SSE value, compared with conventionally chosen initial centre points.
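The abstract does not spell out how SSE is used to pick the initial centres, so the following is only a minimal sketch of one plausible reading: draw several random candidate sets of K initial centroids, score each by the SSE of the assignment it induces, and start K-Means from the candidate with the lowest score. The function names (`sse_guided_init`, `kmeans`), the parameters `n_candidates` and `seed`, and the toy data are assumptions made for illustration and are not taken from the paper.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]

def sse(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(squared_distance(p, centroids[l]) for p, l in zip(points, labels))

def update(points, labels, centroids):
    """Move each centroid to the mean of its members (empty clusters stay put)."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new_centroids.append(old)
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    """Standard K-Means loop; returns centroids, labels and iterations used."""
    centroids = [tuple(c) for c in centroids]
    for iteration in range(1, max_iter + 1):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:  # centroids stopped moving
            break
        centroids = new_centroids
    return centroids, labels, iteration

def sse_guided_init(points, k, n_candidates=10, seed=0):
    """Assumed strategy: keep the random candidate set with the lowest SSE."""
    rng = random.Random(seed)
    best, best_sse = None, float("inf")
    for _ in range(n_candidates):
        candidate = rng.sample(points, k)
        score = sse(points, candidate, assign(points, candidate))
        if score < best_sse:
            best, best_sse = candidate, score
    return best, best_sse

# Toy data (hypothetical): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
init, init_sse = sse_guided_init(points, k=2)
centroids, labels, n_iter = kmeans(points, init)
print("initial SSE:", init_sse)
print("final SSE:", sse(points, centroids, labels), "iterations:", n_iter)
```

Comparing `n_iter` and the final SSE against a run started from a single random draw gives the kind of comparison the abstract reports, although the paper's exact selection procedure may differ from this sketch.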

Full Text