Abstract

This study compares distance-based agglomerative hierarchical clustering techniques with a ratio-based approach, using two real datasets previously analyzed by Roux (2018). The type of scaling applied to the datasets was found to affect the hierarchical clustering results, so several scaling methods were applied before clustering. Two rank-based goodness-of-fit measures were then used to evaluate the clustering methods. In contrast to Roux's (2018) findings, distance-based methods such as median linkage, average linkage, and centroid linkage performed better than the ratio-based method. The ratio-based methods also exhibited branch crossing in the hierarchical clustering dendrogram. This study therefore illustrates that, with appropriate dataset scaling, distance-based methods outperform ratio-based methods in terms of goodness-of-fit measures.
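The workflow summarized above (scale, cluster, then score the fit by rank agreement) can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the study's procedure: the data are synthetic rather than the datasets from Roux (2018), the two scalers (z-score and min-max) are common choices picked for illustration, and Spearman's rho and Kendall's tau between the input distances and the cophenetic distances stand in for the two rank-based measures, which the abstract does not name. The ratio-based method is not implemented here. All helper names (`zscore`, `minmax`, `rank_fit`, `compare`) are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import pdist
from scipy.stats import kendalltau, spearmanr


def zscore(X):
    """Center each column and divide by its sample standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)


def minmax(X):
    """Rescale each column linearly onto the interval [0, 1]."""
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))


# Assumed scaling options and the three linkages named in the abstract.
SCALERS = {"none": lambda X: X, "zscore": zscore, "minmax": minmax}
METHODS = ("average", "centroid", "median")


def rank_fit(X, method):
    """Rank agreement between input distances and dendrogram distances.

    Spearman's rho and Kendall's tau are assumed stand-ins for the
    rank-based goodness-of-fit measures used in the study.
    """
    d = pdist(X, metric="euclidean")   # condensed pairwise distances
    Z = linkage(X, method=method)      # raw observations, Euclidean metric
    coph = cophenet(Z)                 # condensed cophenetic distances
    rho, _ = spearmanr(d, coph)
    tau, _ = kendalltau(d, coph)
    return rho, tau


def compare(X):
    """Score every (scaler, linkage) combination on the same data."""
    return {
        (name, method): rank_fit(scale(X), method)
        for name, scale in SCALERS.items()
        for method in METHODS
    }


# Synthetic data whose columns differ in scale by orders of magnitude,
# so that the choice of scaling can change the resulting hierarchy.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3)) * np.array([1.0, 10.0, 100.0])

results = compare(X)
for (scaler, method), (rho, tau) in sorted(results.items()):
    print(f"{scaler:>6} {method:>8}  spearman={rho:.3f}  kendall={tau:.3f}")
```

Comparing the rows for the same linkage across scalers gives a quick sense of how much the scaling step alone shifts the fit, which is the effect the study reports.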
