HFSMOOK-Means: An Improved K-Means Algorithm Using Hesitant Fuzzy Sets and Multi-objective Optimization

Kamran Rezaei, Hassan Rezaei

doi:10.1007/s13369-020-04620-5

Abstract

Clustering is considered one of the important methods in data mining. The K-means algorithm, one of the most common clustering methods, is highly sensitive to the initial cluster centers. Hence, selecting appropriate initial cluster centers improves the clustering the algorithm produces. The present study aims to find suitable initial cluster centers for K-means. The initial cluster centers should be selected so that clusters with high separation and high density are obtained. Therefore, in this paper, finding initial cluster centers is formulated as a multi-objective optimization problem that maximizes both the distance between the initial cluster centers and the neighbor density of the initial cluster centers. Solving this problem with the MOPSO (multi-objective particle swarm optimization) algorithm provides a set of candidate initial cluster centers. Hesitant fuzzy sets are then used to evaluate the clusters generated from these candidate centers in terms of separation, cohesion, and silhouette index. Next, the informational energy of hesitant fuzzy sets is used to rank the non-dominated particles in the Pareto optimal set and to select the initial cluster centers for starting the K-means algorithm. The proposed HFSMOOK-means method was compared with several clustering algorithms using common and widely used criteria. The results indicate that HFSMOOK-means performed well relative to the other algorithms on the majority of the datasets.
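The two objectives named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the definitions below are assumptions made for illustration only: separation is taken as the smallest pairwise Euclidean distance between candidate centers, and neighbor density as the mean number of data points within a hypothetical `radius` of each center. The function names are likewise hypothetical, and a brute-force non-dominated filter stands in for the MOPSO search.

```python
import math
from itertools import combinations

def center_separation(centers):
    # Assumed form of objective 1: smallest pairwise distance between centers.
    return min(math.dist(a, b) for a, b in combinations(centers, 2))

def neighbor_density(centers, points, radius):
    # Assumed form of objective 2: mean count of points within `radius` of each center.
    counts = [sum(1 for p in points if math.dist(c, p) <= radius) for c in centers]
    return sum(counts) / len(counts)

def dominates(f, g):
    # f Pareto-dominates g when it is no worse in every objective
    # and strictly better in at least one (both objectives are maximized).
    return all(x >= y for x, y in zip(f, g)) and any(x > y for x, y in zip(f, g))

def pareto_front(candidates, points, radius):
    # Score every candidate set of centers, then keep the non-dominated ones.
    scored = [
        (c, (center_separation(c), neighbor_density(c, points, radius)))
        for c in candidates
    ]
    return [c for c, f in scored if not any(dominates(g, f) for _, g in scored)]
```

Each candidate passed to `pareto_front` is a list of center coordinates. The surviving candidates correspond to the non-dominated set that the abstract says is subsequently ranked with hesitant fuzzy informational energy; that ranking step is not sketched here because the abstract does not specify it.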

Full Text