Abstract

Clustering is one of the most common data mining techniques and has been widely applied in many fields. Among clustering approaches, fuzzy clustering can reflect the real world more objectively. Fuzzy C-Means (FCM), one of the most popular fuzzy clustering algorithms, combines fuzzy theory with the K-Means clustering algorithm. However, FCM has several shortcomings: it is very sensitive to initialization, such as the choice of initial clusters; its convergence speed is limited; and it is hard to guarantee a globally optimal solution. These problems are especially pronounced in big data scenarios, where the overall speed is slow and it is hard to run the clustering algorithm on the entire original dataset. To address these challenges, we propose a modified FCM based on Particle Swarm Optimization (PSO). We also present a multi-round random sampling method for the big data problem, which approximates clustering on the original large dataset by using the clustering results obtained on sample datasets. Our experiments show that both the modified FCM using PSO and the multi-round sampling strategy are efficient and effective.
