Abstract

In this work, we propose a novel hierarchical clustering technique for finding a hierarchical structure among the data items in a dataset. The emphasis is on maintaining the consistency of the clusters at each level of the hierarchy and on having a natural number of clusters at each level. We use Cluster Number Assisted k-Means (CNAK) to find the possible natural numbers of clusters for the dataset. Next, we find the association between clusters at different levels. We propose three criteria and use them to remove insignificant levels, which yields the final hierarchical structure. The method thus provides the number of clusters at each level, the clusters at those levels, and their association with clusters at adjacent levels. By contrast, the naïve hierarchical k-means algorithm uses a predetermined branching factor at all levels. In most traditional agglomerative and divisive clustering, exactly one node at each level of the hierarchy is split into several nodes at the next lower level; the number of child nodes is either the same for every parent or decided independently for each parent. Traditional hierarchical clustering does not allow clusters with multiple parents, although in reality a cluster may naturally have more than one. Experiments have been carried out on multiple datasets to demonstrate the effectiveness of the proposed technique.
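As an illustration of associating clusters across adjacent levels, the following minimal Python sketch links each cluster at a finer level to one or more clusters at the coarser level above it. It is not the method of the paper: the function name `link_levels`, the overlap-share rule, and the `min_share` threshold are assumptions made for this example, and the label lists are hand-written toy data rather than CNAK output.

```python
from collections import Counter, defaultdict


def link_levels(coarse_labels, fine_labels, min_share=0.3):
    """Map each fine-level cluster to its coarse-level parent(s).

    Hypothetical rule (an assumption, not the paper's criteria): a coarse
    cluster is a parent of a fine cluster if it holds at least `min_share`
    of the fine cluster's members.
    """
    if len(coarse_labels) != len(fine_labels):
        raise ValueError("label lists must cover the same data items")

    # Count how the members of each fine cluster spread over coarse clusters.
    overlap = defaultdict(Counter)
    for coarse, fine in zip(coarse_labels, fine_labels):
        overlap[fine][coarse] += 1

    parents = {}
    for fine, counts in overlap.items():
        size = sum(counts.values())
        parents[fine] = sorted(
            coarse for coarse, n in counts.items() if n / size >= min_share
        )
    return parents


# Toy labels for eight data items at two levels (2 clusters, then 3 clusters).
coarse = [0, 0, 0, 0, 1, 1, 1, 1]
fine = [0, 0, 1, 1, 1, 1, 2, 2]
links = link_levels(coarse, fine)
print(links)
```

In this toy example, fine cluster 1 draws half of its members from each coarse cluster, so it is linked to both of them, which is the multi-parent case that a strict tree cannot represent.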
