An equi-biased k-prototypes algorithm for clustering mixed-type data

Ravi Sankar Sangam, Hari Om

doi:10.1007/s12046-018-0823-0

Abstract

Clustering is widely recognized as an important approach to data analysis: it partitions data according to some (dis)similarity criterion. In recent years, the problem of clustering mixed-type data has attracted many researchers. The k-prototypes algorithm is well known for its scalability in this respect. In this paper, the limitations of the dissimilarity coefficient used in the k-prototypes algorithm are discussed with some illustrative examples. We propose a new hybrid dissimilarity coefficient for the k-prototypes algorithm that can be applied to data with numerical, categorical and mixed attributes. Our method retains the scalability of the k-prototypes algorithm, and its dissimilarity functions for both attribute types are defined on the same scale with respect to their dimensionality, which is beneficial for improving the clustering result. The efficacy of our method is shown by experiments on real and synthetic data sets.
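For orientation, the dissimilarity used by the standard k-prototypes algorithm can be sketched as below. It combines squared Euclidean distance on the numerical attributes with a weighted count of mismatches on the categorical attributes. This is only a minimal sketch of that baseline coefficient, not the hybrid coefficient proposed in the paper, whose definition is not given in the abstract. The function name, argument names and the weight name `gamma` are illustrative assumptions.

```python
from typing import Sequence


def k_prototypes_dissimilarity(
    x_num: Sequence[float],
    x_cat: Sequence[str],
    proto_num: Sequence[float],
    proto_cat: Sequence[str],
    gamma: float,
) -> float:
    """Baseline k-prototypes dissimilarity between an object and a prototype.

    All names here are hypothetical; `gamma` is the user-chosen weight that
    balances the categorical part against the numerical part.
    """
    # Numerical part: squared Euclidean distance, unbounded and scale-dependent.
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, proto_num))
    # Categorical part: simple matching, i.e. the number of mismatched attributes.
    categorical_part = sum(1 for a, b in zip(x_cat, proto_cat) if a != b)
    return numeric_part + gamma * categorical_part
```

Because the numerical part depends on the units of the data while the categorical part is bounded by the number of categorical attributes, the two parts are generally not on the same scale, and the balance between them rests on the choice of `gamma`.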

Journal: Sādhanā
Publication Date: Mar 1, 2018
Citations: 14

Similar Papers

A Fast K-prototypes Algorithm Using Partial Distance Computation
Byoungwook Kim
Symmetry | VOL. 9
21 Apr 2017

Determining the number of clusters using information entropy for mixed data
Jiye Liang ... Fuyuan Cao
Pattern Recognition | VOL. 45
24 Dec 2011

An Efficient Grid-Based K-Prototypes Algorithm for Sustainable Decision-Making on Spatial Objects
Hong-Jun Jang ... Soon-Young Jung
Sustainability | VOL. 10
25 Jul 2018

Comparison of distance and dissimilarity measures for clustering data with mix attribute types
Hermawan Prasetyo ... Ayu Purwarianti
01 Nov 2014
