Clustering by partitioning around medoids using distance-based similarity measures on interval-scaled variables

D.L Nkweteyim

doi:10.4314/njtd.v15i1.1

Abstract

This paper reports the results of a study of the partitioning around medoids (PAM) clustering algorithm applied to four datasets, both standardized and unstandardized, of varying sizes and numbers of clusters. The angular distance proximity measure, in addition to the two more traditional proximity measures, namely the Euclidean and Manhattan distances, was used to compute object-object similarity. The data used in the study comprise three widely available datasets and one constructed from publicly available climate data. The results replicate some well-known facts about the PAM algorithm, namely that the quality of the generated clusters tends to be much better for small datasets, that the silhouette value is a good, even if not perfect, guide to the optimal number of clusters to generate, and that human intervention is required to interpret the generated clusters. The results also indicate that the angular distance measure, which traditionally has not been widely used in clustering, outperforms both the Euclidean and Manhattan distance metrics in certain situations.

Keywords: PAM, Euclidean, Manhattan, Angular distance, Silhouette
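To make the three proximity measures named in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. The function names are illustrative. Two choices are assumptions, since the abstract gives no formulas: the angular distance is taken as the angle in radians between two vectors (the arccosine of their cosine similarity), and standardization is taken as a per-variable z-score using the population standard deviation. The paper may define either differently.

```python
import math

def euclidean(x, y):
    """Straight-line (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))

def angular(x, y):
    """Angle in radians between x and y (assumed definition).

    Zero vectors are not handled: their direction is undefined.
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_sim = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos_sim)

def standardize(rows):
    """Z-score each column (assumed standardization scheme).

    Constant columns are left centred at zero (their std is replaced by 1).
    """
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Under this definition the angular distance depends only on the direction of the vectors, not their magnitude, which is one plausible reason it can behave differently from the Euclidean and Manhattan distances on unstandardized data.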

Highlights

Cluster analysis is an unsupervised machine learning task used to find structure in unlabelled data
Interpretation of generated clusters often requires human intervention to explain patterns that are common to members of the clusters

Summary

Introduction

Cluster analysis (or clustering) is an unsupervised machine learning task used to find structure in unlabelled data. The clustering task groups a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other clusters (Aldenderfer and Blashfield, 1984; Han et al., 2006). Several clustering approaches have been developed to address different types of data. These include partitioning approaches, hierarchical approaches, density-based methods, grid-based methods, model-based methods, special techniques for clustering high-dimensional data, and constraint-based clustering (Han et al., 2006; Yinghua et al., 2016).
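As one concrete illustration of the partitioning approaches listed above, the following is a minimal Python sketch of a PAM-style k-medoids procedure together with a mean silhouette score. It is a simplified illustration under stated assumptions, not the procedure used in the paper: the initial medoids are simply the first k objects (the classical PAM BUILD phase is omitted), swaps are accepted greedily as soon as they lower the total cost, and k is assumed to be at least 2. All function names are hypothetical.

```python
import math
from itertools import product

def distance_matrix(points, dist=math.dist):
    """Pairwise distances; math.dist is Euclidean (Python 3.8+)."""
    return [[dist(p, q) for q in points] for p in points]

def total_cost(D, medoids):
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(D[i][m] for m in medoids) for i in range(len(D)))

def pam(D, k):
    """Greedy swap-based k-medoids on a precomputed distance matrix D."""
    n = len(D)
    medoids = list(range(k))  # naive initialization (assumption, not BUILD)
    best = total_cost(D, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing each medoid position with each non-medoid object.
        for pos, cand in product(range(k), range(n)):
            if cand in medoids:
                continue
            trial = medoids[:pos] + [cand] + medoids[pos + 1:]
            cost = total_cost(D, trial)
            if cost < best - 1e-12:  # accept only strict improvements
                medoids, best, improved = trial, cost, True
    # Label each object with the index of its nearest medoid.
    labels = [min(range(k), key=lambda j: D[i][medoids[j]]) for i in range(n)]
    return medoids, labels

def mean_silhouette(D, labels):
    """Average silhouette width; singleton clusters score 0 by convention."""
    clusters = {}
    for i, c in enumerate(labels):
        clusters.setdefault(c, []).append(i)
    scores = []
    for i, c in enumerate(labels):
        own = clusters[c]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to own cluster; b: to the nearest other cluster.
        a = sum(D[i][j] for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(D[i][j] for j in members) / len(members)
            for other, members in clusters.items() if other != c
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Usage: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
D = distance_matrix(points)
medoids, labels = pam(D, 2)
score = mean_silhouette(D, labels)
```

Because the procedure works only on the distance matrix, a different proximity measure can be substituted by passing another function as `dist`. Running `pam` for several values of k and comparing the resulting mean silhouette values is one way to choose the number of clusters.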


Journal: Nigerian Journal of Technological Development	Publication Date: Mar 7, 2018
License type: cc-by
