Finding hierarchy of clusters

Shankho Subhra Pal, Jayanta Mukhopadhyay, Sudeshna Sarkar

doi:10.1016/j.patrec.2023.12.009

Abstract

In this work, a novel hierarchical clustering technique is proposed that finds a hierarchical structure of the data items in a dataset. The emphasis is on maintaining the consistency of the clusters at each level of the hierarchy and on having a natural number of clusters at each level. We first use Cluster Number Assisted k-Means (CNAK) to find the possible natural numbers of clusters for the dataset. Next, we find the association between clusters at different levels. We then propose three criteria and use them to remove the insignificant levels, yielding the final hierarchical structure. The method thus determines the number of clusters at each level, the clusters at those levels, and their association with clusters at adjacent levels. By contrast, the naïve hierarchical k-means algorithm uses a pre-determined branching factor at all levels. In most traditional agglomerative and divisive clustering, exactly one node at each level of the hierarchy is split into several nodes at the next lower level, and the number of children is either the same for every parent or decided independently for each. Clusters with multiple parents are not allowed in traditional hierarchical clustering, although in reality a cluster may naturally have multiple parents. Experiments have been carried out on multiple datasets to demonstrate the effectiveness of the proposed technique.
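
The association step between adjacent levels can be illustrated with a minimal sketch. The abstract does not state how the association is computed, so the overlap rule below is an assumption made only for illustration: the function name link_levels and the min_share threshold are hypothetical, and the two labelings are toy data standing in for the clusterings that CNAK would suggest at a coarser and a finer level. A finer cluster is linked to every coarser cluster that holds at least min_share of its points, which is one simple way a cluster can end up with multiple parents.

```python
from collections import Counter, defaultdict


def link_levels(parent_labels, child_labels, min_share=0.25):
    """Link each finer (child) cluster to coarser (parent) clusters.

    Assumed rule, not taken from the paper: a child is linked to every
    parent that contains at least `min_share` of the child's points,
    so a child may receive zero, one, or several parents.
    """
    if len(parent_labels) != len(child_labels):
        raise ValueError("both labelings must cover the same points")

    # overlap[child][parent] = number of points shared by the two clusters
    overlap = defaultdict(Counter)
    for parent, child in zip(parent_labels, child_labels):
        overlap[child][parent] += 1

    links = {}
    for child, counts in overlap.items():
        size = sum(counts.values())
        links[child] = sorted(
            parent for parent, n in counts.items() if n / size >= min_share
        )
    return links


# Toy labelings of the same eight points at two levels (hypothetical data).
coarse = [0, 0, 0, 0, 1, 1, 1, 1]  # 2 clusters at the upper level
fine = [0, 0, 1, 1, 1, 1, 2, 2]    # 3 clusters at the lower level

links = link_levels(coarse, fine)
print(links)  # fine cluster 1 straddles both coarse clusters
```

In this toy case the middle finer cluster shares half of its points with each coarser cluster, so it is linked to both; raising min_share above 0.5 would leave it with no parent, which shows how sensitive such a rule is to the chosen threshold.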

Full Text