Abstract

Big data environments contain many high-dimensional multi-view data sets arising from complex, large-scale applications. However, traditional clustering algorithms treat all features as equally relevant, which makes them poorly suited to such data. To address this challenge, we propose a novel approach named the intelligent weighting k-means clustering approach (IWKM), which is based on swarm intelligence and the k-means algorithm. Because k-means is sensitive to its initial cluster centers, IWKM uses the global search capability of swarm intelligence to find the initial cluster centers together with the view and feature weights. The weighting k-means approach is then applied to determine the clusters of objects, starting from the initial cluster centers and the view and feature weights obtained by swarm intelligence. IWKM has the following characteristics: in the clustering model, every view and every feature has its own weight, and these weights affect the cluster to which an object is assigned. The view and feature weights are calculated by a swarm intelligence algorithm. In addition, the degree of coupling between clusters is introduced into the clustering model to enlarge the dissimilarity between clusters. Comprehensive experiments are conducted on three high-dimensional multi-view data sets from a machine learning repository. The experimental results are compared with those of five other state-of-the-art clustering algorithms using the evaluation metrics Rand index, Jaccard coefficient and Folkes Russel. The experiments show that the new approach generates better clustering results than the compared algorithms when dealing with high-dimensional multi-view data in a big data environment.
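To make the role of the view and feature weights concrete, the following is a minimal Python sketch of a weighted assignment step and a center update. It is an illustration under stated assumptions, not the IWKM objective itself: the abstract does not give the distance formula, so the product of view weight and feature weight used here is assumed, and the names `weighted_sq_distance`, `assign_clusters`, `update_centers`, `views`, `view_weights` and `feature_weights` are hypothetical. The swarm-intelligence search for the weights and initial centers and the inter-cluster coupling term are not modeled; the weights and centers are simply fixed by hand.

```python
from typing import List, Sequence


def weighted_sq_distance(x: Sequence[float], center: Sequence[float],
                         views: List[List[int]],
                         view_weights: Sequence[float],
                         feature_weights: Sequence[float]) -> float:
    """Squared distance in which each feature's term is scaled by its own
    weight and by the weight of the view it belongs to (assumed form)."""
    total = 0.0
    for v, feature_indices in enumerate(views):
        for j in feature_indices:
            total += view_weights[v] * feature_weights[j] * (x[j] - center[j]) ** 2
    return total


def assign_clusters(data, centers, views, view_weights, feature_weights) -> List[int]:
    """Assign every object to the center with the smallest weighted distance."""
    labels = []
    for x in data:
        dists = [weighted_sq_distance(x, c, views, view_weights, feature_weights)
                 for c in centers]
        labels.append(dists.index(min(dists)))
    return labels


def update_centers(data, labels, centers):
    """Recompute each center as the mean of its members; a center with no
    members is left unchanged."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [x for x, lab in zip(data, labels) if lab == k]
        if not members:
            new_centers.append(tuple(old))
            continue
        dim = len(old)
        new_centers.append(tuple(sum(m[j] for m in members) / len(members)
                                 for j in range(dim)))
    return new_centers


# Toy data: view 0 holds features 0 and 1 (informative),
# view 1 holds feature 2 (noisy). All values are made up for illustration.
data = [(0.0, 0.1, 9.0), (0.2, 0.0, 0.0), (5.0, 5.1, 0.5), (5.2, 4.9, 8.5)]
views = [[0, 1], [2]]
view_weights = [0.9, 0.1]          # hand-picked, not learned
feature_weights = [0.5, 0.5, 1.0]  # hand-picked, not learned
centers = [(0.0, 0.0, 4.0), (5.0, 5.0, 4.0)]

labels = assign_clusters(data, centers, views, view_weights, feature_weights)
centers = update_centers(data, labels, centers)
print(labels)
```

With the noisy view down-weighted, the first two objects and the last two objects end up in separate clusters even though their values on feature 2 differ widely. In the approach described above, the weights and initial centers would come from the swarm-intelligence search rather than being set manually.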

