
In the k-means algorithm, the initial cluster centroids are selected arbitrarily, so each run can produce a different set of clusters. Consequently, the accuracy and performance of k-means depend heavily on the selection of the initial centroids, and these centroids should be chosen carefully. In view of this, this paper proposes a new Modified Partition-based Cluster Initialization method for k-means, called MP-k-means. MP-k-means is an amended version of P-k-means [1], in which the range of values of each dimension is divided into ‘k’ equi-sized partitions based on the arithmetic average. Because the arithmetic average is sensitive to extreme values, this division is affected by outliers present in the data. To remove the effect of outliers, MP-k-means partitions each dimension based on a positional average instead of the arithmetic average. Six popular datasets are used for empirical evaluation of the algorithms. The empirical results are compared and validated using various external and internal clustering validation measures. The comparative results show that MP-k-means is significantly superior to the basic k-means and P-k-means. The proposed method may also be applied to other clustering algorithms that depend on the selection of initial cluster centroids.
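
The contrast between an arithmetic average and a positional average can be illustrated with a minimal sketch. This is not the published P-k-means or MP-k-means procedure, which the abstract does not spell out. It assumes that the sorted values of each dimension are split into k groups of nearly equal size and that each group is represented either by its mean or by its median, with the median standing in for the "positional average". The function name `partition_init` and the parameter `use_median` are hypothetical.

```python
import numpy as np


def partition_init(X, k, use_median=True):
    """Illustrative partition-based seeding (an assumption, not the paper's exact method).

    For every dimension, the sorted values are split into k nearly
    equal-sized groups. Coordinate j of centroid i is the median
    (use_median=True) or the mean (use_median=False) of group i.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    if not 1 <= k <= n_samples:
        raise ValueError("k must be between 1 and the number of samples")

    centroids = np.empty((k, n_features))
    for j in range(n_features):
        sorted_col = np.sort(X[:, j])
        # np.array_split tolerates sizes that are not divisible by k
        groups = np.array_split(sorted_col, k)
        for i, group in enumerate(groups):
            centroids[i, j] = np.median(group) if use_median else group.mean()
    return centroids


if __name__ == "__main__":
    # One-dimensional toy data with a single extreme outlier (1000).
    X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [1000.0]])
    print("median-based seeds:", partition_init(X, 2, use_median=True).ravel())
    print("mean-based seeds:  ", partition_init(X, 2, use_median=False).ravel())
```

On this toy data the median-based seed for the upper group stays at 11, while the mean-based seed is pulled above 300 by the single outlier. The returned array could then be supplied as explicit starting centroids to any k-means implementation that accepts them.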

