Abstract

For large-scale data sets, the traditional Affinity Propagation (AP) clustering algorithm processes every data sample iteratively while computing three metrics, which leads to poor clustering quality and high time consumption. To address this problem, we propose KCAP, an algorithm based on Affinity Propagation clustering and the Canopy clustering algorithm. First, the KNN distance is used to calculate the local density of each data point. Second, the Canopy algorithm roughly clusters the data set, and the resulting center points form a new data set \(Y\). Then, the affinity propagation clustering algorithm is applied to \(Y\). Finally, each data point is assigned to its nearest center. Experimental results on synthetic data sets and UCI data sets show that, compared with the traditional AP algorithm, the K-means algorithm, and the FCM algorithm, KCAP not only improves clustering accuracy but also achieves higher efficiency.

Keywords

K-Nearest Neighbors (KNN); Initial center; Canopy algorithm; Affinity Propagation (AP); Rough clustering
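The pipeline outlined in the abstract can be illustrated with a minimal sketch. It covers only the KNN density step, the Canopy rough clustering that produces the reduced set Y, and the final nearest-center assignment; the AP step on Y is not shown, so here points are assigned directly to the canopy centers. Everything specific is an assumption made for illustration and is not taken from the paper: the function names `knn_density`, `canopy_centers`, and `assign_to_nearest`, the parameters `k` and `t2`, the definition of density as the inverse mean distance to the k nearest neighbours, the use of density to order canopy candidates, and the use of a single removal threshold instead of the usual pair of Canopy thresholds.

```python
import math

def knn_density(points, k):
    """Assumed density: inverse of the mean distance to the k nearest neighbours."""
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        mean_knn = sum(dists[:k]) / k
        densities.append(1.0 / (mean_knn + 1e-12))  # small constant avoids division by zero
    return densities

def canopy_centers(points, densities, t2):
    """Simplified Canopy pass: visit points from densest to sparsest and
    keep a point as a center only if it is farther than t2 from every
    center chosen so far (assumption: one threshold, density ordering)."""
    order = sorted(range(len(points)), key=lambda i: densities[i], reverse=True)
    centers = []
    for i in order:
        if all(math.dist(points[i], c) > t2 for c in centers):
            centers.append(points[i])
    return centers

def assign_to_nearest(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        for p in points
    ]

# Toy data: two well-separated groups of four points each.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
]

densities = knn_density(points, k=2)
centers = canopy_centers(points, densities, t2=1.0)  # plays the role of the reduced set Y
# In the full method, AP would be run on `centers` here to obtain the final exemplars.
labels = assign_to_nearest(points, centers)
```

Because the expensive clustering step would run only on the much smaller set of canopy centers rather than on every sample, a reduction of this kind is where the efficiency gain claimed in the abstract would be expected to come from.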

