Abstract

Traditional clustering algorithms assign every sample, including uncertain ones, to exactly one cluster, which fails to reflect that a cluster may not have a clear boundary. When a dataset contains a large number of missing values, traditional clustering methods also cannot cluster it well. To address both problems, the idea of three-way decision is introduced into traditional k-means clustering and combined with the knowledge of set-pair information granules. This paper presents a three-way clustering method that can process missing values effectively. First, the granules corresponding to missing values are recorded as the degree of difference. Next, the distance between each sample and the cluster centers is established according to set-pair theory. All samples are then assigned to clusters according to these distances, forming three-way clustering results that consist of a positive region, a boundary region, and a negative region, which improves the structure of the clustering results. For a given cluster, samples in the positive region certainly belong to it, samples in the boundary region may belong to it, and samples in the negative region do not belong to it; the three regions together represent the clustering result. Finally, the validity of the algorithm is verified on UCI data.
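The three-region assignment described above can be illustrated with a minimal sketch. The abstract does not give the set-pair distance formula or the thresholds, so everything concrete here is an assumption made for illustration: the helper names `partial_distance` and `three_way_assign`, the rescaled Euclidean distance over observed attributes (a stand-in for the paper's set-pair distance), the use of the missing-attribute fraction as a rough proxy for the degree of difference, and the parameters `ratio` and `max_missing`.

```python
import math

def partial_distance(x, c):
    """Distance from sample x to center c using only observed attributes.

    Missing values in x are encoded as None. Returns (distance, missing_fraction).
    ASSUMPTION: a rescaled Euclidean distance stands in for the paper's
    set-pair distance, which the abstract does not specify.
    """
    observed = [(xi, ci) for xi, ci in zip(x, c) if xi is not None]
    missing_fraction = 1 - len(observed) / len(x)
    if not observed:
        return math.inf, 1.0
    squared = sum((xi - ci) ** 2 for xi, ci in observed)
    # Rescale so samples with fewer observed attributes stay comparable.
    return math.sqrt(squared * len(x) / len(observed)), missing_fraction

def three_way_assign(x, centers, ratio=1.5, max_missing=0.3):
    """Return (positive, boundary) sets of cluster indices for sample x.

    Every cluster index in neither set is in the negative region for x.
    ASSUMPTION: ratio and max_missing are hypothetical thresholds.
    """
    results = [partial_distance(x, c) for c in centers]
    dists = [d for d, _ in results]
    missing_fraction = results[0][1]  # depends only on x, not on the center
    best = min(range(len(centers)), key=lambda k: dists[k])
    # Clusters whose distance is close to the smallest one are candidates.
    near = [k for k in range(len(centers)) if dists[k] <= ratio * dists[best]]
    if len(near) == 1 and missing_fraction <= max_missing:
        return {best}, set()   # certain: positive region of one cluster
    return set(), set(near)    # uncertain: boundary region of each candidate

centers = [(0.0, 0.0), (10.0, 10.0)]
print(three_way_assign((1.0, 0.5), centers))   # clearly near cluster 0
print(three_way_assign((5.0, 5.2), centers))   # between both clusters
print(three_way_assign((1.0, None), centers))  # half of the attributes missing
```

In this sketch a sample lands in a positive region only when one center is clearly closest and few of its attributes are missing; otherwise it is placed in the boundary region of every nearby cluster, so missing information lowers the certainty of the assignment rather than being forced into a single cluster.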
