Abstract

Cluster analysis groups objects using information in the data that describes the relationships between them, with the aim of maximizing similarity among members of the same cluster and minimizing similarity between different clusters. It is useful for object identification (recognition), decision support systems, and data mining. Cluster analysis methods are either hierarchical (Average Linkage, Single Linkage, Complete Linkage, Ward's, and Centroid) or non-hierarchical (K-Means), and each method has its own advantages and disadvantages. Several distance measures are also commonly used in the grouping process, including Euclidean distance, the Canberra Metric, and the Czekanowski Coefficient. In practice, researchers choose one cluster analysis method, or several for comparison, together with a particular distance measure, and apply them to the data to group objects according to certain criteria. This research studies and evaluates three measures of object similarity, namely Euclidean distance, the Canberra Metric, and the Czekanowski Coefficient, using the Complete Linkage method on simulated data. The evaluation concludes that Euclidean distance is a better measure of object similarity for grouping cases than the Canberra Metric and the Czekanowski Coefficient.
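
The abstract does not state the formulas or the simulation design, so the following is only a minimal sketch of the kind of comparison described: three distance measures, each passed to a simple Complete Linkage procedure. The definitions of the Canberra Metric and the Czekanowski Coefficient used here are the common textbook forms for non-negative variables, which is an assumption. The function names, the toy data, and the choice of two clusters are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def euclidean(x, y):
    # Straight-line distance between two points.
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def canberra(x, y):
    # Assumed textbook form for non-negative variables:
    # sum of |a - b| / (a + b); terms with a + b == 0 are skipped.
    return sum(abs(a - b) / (a + b) for a, b in zip(x, y) if a + b != 0)


def czekanowski(x, y):
    # Assumed textbook form for non-negative variables:
    # 1 - 2 * sum(min(a, b)) / sum(a + b).
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        return 0.0
    return 1.0 - 2.0 * sum(min(a, b) for a, b in zip(x, y)) / total


def complete_linkage(points, k, dist):
    # Naive agglomerative clustering: start with one cluster per object and
    # repeatedly merge the two clusters whose FARTHEST pair of members is
    # closest, until k clusters remain. Returns sorted lists of point indices.
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            d = max(dist(points[p], points[q])
                    for p in clusters[i] for q in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(sorted(c) for c in clusters)


# Hypothetical toy data: two well-separated groups of three objects each.
points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2),
          (8.0, 9.0), (8.5, 9.5), (9.0, 8.8)]

for name, dist in [("Euclidean", euclidean),
                   ("Canberra", canberra),
                   ("Czekanowski", czekanowski)]:
    print(name, complete_linkage(points, 2, dist))
```

On data this cleanly separated, all three measures recover the same two groups, so the sketch shows only the mechanics. It does not reproduce the study's evaluation, which would require less clearly separated simulated data and a criterion for judging the recovered groupings.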
