Abstract
In recent years, several clustering algorithms have been developed, for example K-means and K-medoids. Most of these algorithms share two problems: the appropriate number of clusters must be selected in advance, and they are sensitive to noisy data, which reduces the accuracy of the resulting clustering. This paper therefore introduces, from a geometric perspective, a new Hybrid Grid-based Gravitational Clustering Algorithm (HGGCA), which automatically detects the number of clusters in the target data set, finds clusters of arbitrary shape, and filters out noisy data. The proposed algorithm moves the cluster centers toward areas of high data density, based on Newton's law of gravity and Newton's laws of motion. The experimental results show that the proposed method achieves higher accuracy than the existing K-means and K-medoids methods. In this study, we used cluster validity indices to verify the validity of the proposed and existing clustering methods. The experimental results show that the proposed algorithm produces high-quality clusters.
Highlights
Clustering is arguably the most significant unsupervised learning problem
Clustering is the task of grouping similar objects together and placing dissimilar objects in different groups (Han, 2006)
$\sum_{i=1}^{P} (x_i, y_i)$, where $P$ is the total number of data points within the $i$th grid. Step 3.3: Update the grid center $C_i$ of each grid by Newton's law of gravity and Newton's laws of motion
Summary
Clustering is arguably the most significant unsupervised learning problem. Clustering is the task of grouping similar objects together and placing dissimilar objects in different groups (Han, 2006). Cluster analysis finds similarities between data according to their characteristics. Several families of clustering methods exist, including partitioning, hierarchical, density-based, grid-based, gravitational (Thammano and Sangkapas, 2011; Gomez et al., 2003), model-based, and constraint-based methods (Jain et al., 1999). The existing k-means and k-medoids methods require the value of k (the number of clusters) to be determined before clustering, which is a difficult task. Our focus was on grid-based and gravitational clustering methods.
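To make the grid-based and gravitational ideas concrete, the following is a minimal sketch and not the HGGCA procedure itself. It rests on several assumptions that this summary does not confirm: data points are 2-D, each grid center is the mean of the points in its cell, the mass of a grid center equals its point count, and a center starts each step at rest. The function names `grid_centers` and `gravitational_step` and the parameters `cell_size`, `g`, `dt`, and `eps` are hypothetical.

```python
import math
from collections import defaultdict


def grid_centers(points, cell_size):
    """Bin 2-D points into square cells; return {cell: ((cx, cy), count)}.

    Assumption: the grid center is the mean of the points in the cell.
    """
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))

    centers = {}
    for key, members in cells.items():
        p = len(members)  # number of data points in this cell
        cx = sum(x for x, _ in members) / p
        cy = sum(y for _, y in members) / p
        centers[key] = ((cx, cy), p)
    return centers


def gravitational_step(centers, g=1.0, dt=0.1, eps=1e-9):
    """Move each grid center once under the pull of all other centers.

    Assumptions: mass = point count, F = g * m_i * m_j / r^2,
    a = F / m_i, and displacement = 0.5 * a * dt^2 (starting from rest).
    """
    items = list(centers.items())
    moved = {}
    for key_i, ((xi, yi), mi) in items:
        ax = ay = 0.0
        for key_j, ((xj, yj), mj) in items:
            if key_i == key_j:
                continue
            dx, dy = xj - xi, yj - yi
            r2 = dx * dx + dy * dy + eps  # eps avoids division by zero
            r = math.sqrt(r2)
            a = g * mj / r2  # acceleration of center i caused by center j
            ax += a * dx / r
            ay += a * dy / r
        moved[key_i] = ((xi + 0.5 * ax * dt * dt, yi + 0.5 * ay * dt * dt), mi)
    return moved


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.3, 0.3), (2.1, 0.1)]
    start = grid_centers(pts, cell_size=1.0)
    print(start)
    print(gravitational_step(start))
```

In this sketch a sparsely populated cell is pulled more strongly toward a densely populated one than the reverse, which mirrors the stated goal of moving centers toward high-density areas. Merging nearby centers and filtering noise are left out.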