Abstract

Agglomerative hierarchical clustering algorithms for finding groups in unlabelled data produce different clusterings depending on how the similarity between two clusters is measured. In this context, the linkage method is the criterion used to decide which two clusters are merged. However, it is not possible to establish in advance which linkage method will provide the best grouping of the data. In the classic hierarchical clustering algorithm, the two most similar clusters according to a single linkage method are merged into one cluster. In this work, we propose a hierarchical clustering algorithm that, in each step, aggregates the criteria of the single, complete and average linkage methods to determine the two most similar clusters to merge. In each step, each linkage method yields a ranking of the pairs of clusters, ordered from most to least similar. These rankings are aggregated using ranking rules to reach a consensus among the different criteria about which pair should be merged into a single cluster. Results obtained from validating the new algorithm with the Rand Index on data sets with different characteristics show that the proposed algorithm is useful for reducing the impact of the chosen linkage method on the final clusters obtained.
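The merge step described above can be illustrated with a minimal sketch. The abstract does not say which ranking rules are used, so the sketch assumes a Borda-style rule: each pair's positions in the three rankings are summed, and the pair with the lowest total is merged. The function names (`linkage_distances`, `consensus_clustering`), the Euclidean distance and the tie-breaking by pair order are likewise assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations
from math import dist


def linkage_distances(a, b, points):
    """Return (single, complete, average) linkage distances between clusters a and b.

    a and b are lists of indices into points; Euclidean distance is an assumption.
    """
    d = [dist(points[i], points[j]) for i in a for j in b]
    return min(d), max(d), sum(d) / len(d)


def consensus_clustering(points, n_clusters):
    """Agglomerative clustering that merges the pair preferred by a rank consensus.

    Assumption: rankings are aggregated with a Borda-style sum of rank positions.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        pairs = list(combinations(range(len(clusters)), 2))
        dists = [linkage_distances(clusters[p], clusters[q], points) for p, q in pairs]

        # One ranking per linkage method (0: single, 1: complete, 2: average),
        # ordered from most similar (smallest distance) to least similar.
        scores = [0] * len(pairs)
        for method in range(3):
            order = sorted(range(len(pairs)), key=lambda idx: dists[idx][method])
            for rank, idx in enumerate(order):
                scores[idx] += rank  # ties are broken by pair order (stable sort)

        # The consensus pair is the one with the lowest summed rank.
        best = min(range(len(pairs)), key=lambda idx: scores[idx])
        p, q = pairs[best]  # p < q, so deleting q leaves index p valid
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters
```

For example, calling `consensus_clustering([(0, 0), (0, 1), (10, 10), (10, 11)], 2)` groups the two points near the origin together and the two distant points together. Other ranking rules could be substituted in the aggregation loop without changing the rest of the procedure.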
