Abstract

Clustering in data mining is a discovery process that groups a set of data such that the intra-cluster similarity is maximized and the inter-cluster similarity is minimized. Existing clustering algorithms, such as k-means, are designed to find clusters that fit a static model, but these algorithms can break down if the choice of parameters in the static model is incorrect with respect to the data set being clustered, or if the model is not adequate to capture the characteristics of the clusters. Furthermore, most of these algorithms break down when the data consists of clusters that are of diverse shapes, densities and sizes. In this paper, a novel clustering algorithm is presented that clusters the data set in O(n) time and O(n) space, without requiring a stopping criterion to be specified for the data set being clustered (unlike k-means, where the value of k must be specified explicitly). The algorithm first normalizes the data set and then constructs a mesh that covers the whole data set. Every point in the data set is then assigned a box number, and these boxes are clustered instead of the actual points. The algorithm does not use any distance measure, such as Euclidean distance, to cluster points. Apart from serving as an independent clustering algorithm, it can be used to obtain an upper bound on the number of clusters to be formed by other algorithms such as k-means and ISODATA.
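
The steps named in the abstract (normalize, lay a mesh over the data, assign box numbers, cluster the boxes) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not say how the mesh resolution is chosen or how boxes are merged, so the `cells_per_axis` parameter, the function name `grid_cluster`, and the rule "occupied boxes that touch belong to the same cluster" are all assumptions made for illustration.

```python
from collections import deque
from itertools import product


def grid_cluster(points, cells_per_axis=10):
    """Label points by binning them into a mesh and merging touching boxes.

    Hypothetical sketch: `cells_per_axis` and the merge rule are assumptions,
    not details taken from the paper.
    """
    if not points:
        return []
    dims = len(points[0])

    # Step 1: min-max normalisation bounds for every coordinate.
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    spans = [(hi - lo) or 1.0 for lo, hi in zip(lows, highs)]

    # Step 2: assign each point a box, i.e. a tuple of integer cell indices.
    def box_of(p):
        return tuple(
            min(int((p[d] - lows[d]) / spans[d] * cells_per_axis),
                cells_per_axis - 1)
            for d in range(dims)
        )

    boxes = [box_of(p) for p in points]
    occupied = set(boxes)

    # Step 3 (assumed merge rule): occupied boxes whose indices differ by at
    # most 1 on every axis are neighbours; each connected group is a cluster.
    offsets = [o for o in product((-1, 0, 1), repeat=dims) if any(o)]
    label_of_box = {}
    next_label = 0
    for start in boxes:
        if start in label_of_box:
            continue
        label_of_box[start] = next_label
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for off in offsets:
                neighbour = tuple(c + o for c, o in zip(current, off))
                if neighbour in occupied and neighbour not in label_of_box:
                    label_of_box[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1

    # Every point inherits the label of its box; no point-to-point
    # distance is ever computed.
    return [label_of_box[b] for b in boxes]


points = [(0.0, 0.0), (0.1, 0.1), (0.05, 0.12),
          (10.0, 10.0), (9.9, 10.0), (9.8, 9.9)]
labels = grid_cluster(points, cells_per_axis=10)
print(labels, "clusters:", len(set(labels)))
```

For a fixed number of dimensions, binning touches each point once and the merge step visits each occupied box once, which is in the spirit of the linear time and space claimed above. Under these assumptions, the number of distinct labels could also serve as a candidate upper bound for k when seeding k-means.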
