Abstract

Data clustering is a popular data mining technique for discovering the structure of a data set. However, the usefulness of the results depends on the nature of the cluster prototypes generated by the clustering technique. Some clustering algorithms simply label the data points, so the prototype of a cluster is the full set of data points belonging to it. Other techniques produce a single 'abstract' data point as the model for the whole cluster, losing information about the cluster's shape, size and structure. This paper proposes an on-line cluster prototype generation mechanism for the Gravitational Clustering algorithm. The idea is to use the dynamics of the gravitational system and the inherent hierarchical property of the gravitational algorithm to determine summarized cluster prototypes while the algorithm is finding the clusters. In this way, a cluster is represented by several different 'abstract' data points, allowing the algorithm to find an appropriate representation of the clusters it finds. The performance of the proposed mechanism is evaluated experimentally on two types of synthetic data sets: data sets with Gaussian clusters and data sets with non-parametric clusters. Our results show that the proposed mechanism is able to deal with noise, finds the appropriate number of clusters, and finds an appropriate set of cluster prototypes.

Keywords: clustering, gravitational, hierarchical, prototype

