Abstract

Clustering is one of the most common data mining techniques and has been widely applied in many fields. Fuzzy clustering, which lets a data point belong to several clusters with different degrees of membership, often reflects real-world data more faithfully than hard clustering. Fuzzy C-Means (FCM), one of the most popular fuzzy clustering algorithms, combines fuzzy set theory with the K-Means clustering algorithm. FCM nevertheless has several weaknesses: it is very sensitive to initialization, in particular to the choice of initial cluster centers; its convergence speed is limited; and it is not guaranteed to reach the global optimum. These problems become more severe in big data scenarios, where the algorithm is slow overall and running it on the entire original dataset is often impractical. To address these challenges, we propose a modified FCM based on Particle Swarm Optimization (PSO). We also present a multi-round random sampling method for big data, which approximates the clustering of the original large dataset using clustering results obtained on sampled subsets. Our experiments show that both the PSO-based FCM and the multi-round sampling strategy are efficient and effective.
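For orientation, the baseline that the paper modifies can be sketched as follows. This is a minimal illustration of textbook FCM only, not the PSO-based variant or the multi-round sampling method proposed here. The function name `fcm`, its parameters (`m`, `max_iter`, `tol`, `seed`), and the random membership initialization are assumptions made for the example. That random starting point is the kind of initialization FCM is sensitive to.

```python
import random


def fcm(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means on a list of equal-length numeric tuples.

    Returns (centers, memberships). Hypothetical helper for illustration;
    it is not the modified algorithm proposed in the paper.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    power = 1.0 / (m - 1.0)  # exponent applied to squared distances
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** m for i in range(n)]
            wsum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / wsum
                 for d in range(dim)]
            )

        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - c[d]) ** 2 for d in range(dim))
                  for c in centers]
            if min(d2) == 0.0:
                # The point sits exactly on a center: share membership
                # among the zero-distance centers only.
                zeros = [j for j, v in enumerate(d2) if v == 0.0]
                new_row = [1.0 / len(zeros) if j in zeros else 0.0
                           for j in range(n_clusters)]
            else:
                inv = [v ** (-power) for v in d2]
                total = sum(inv)
                new_row = [v / total for v in inv]
            shift = max(shift,
                        max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break
    return centers, u


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    centers, memberships = fcm(data, n_clusters=2)
    print(sorted(centers))
```

Changing `seed` changes the starting memberships. On harder datasets than this toy example, different seeds can lead to different local optima, which is the weakness a global search such as PSO is meant to reduce.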
