A weighted k-modes clustering using new weighting method based on within-cluster and between-cluster impurity measures

Kyoungok Kim

doi:10.3233/jifs-16157

Abstract

Partitioning a set of objects into groups, or clusters, is a fundamental task in data mining, and clustering is a popular approach to it. Among clustering algorithms, k-means is well known and widely applied, but it handles only numerical attributes. The k-modes algorithm extends k-means to categorical variables and has several variations, such as fuzzy methods. This paper presents a new attribute weighting method for the k-modes algorithm that uses impurity measures such as entropy and Gini impurity. The proposed algorithm considers the distribution of attribute categories both within the same cluster and between different clusters. As a result, the categorical variables that the new algorithm identifies as more important than others have a significant influence on the similarity calculation, which improves clustering performance, as confirmed by experiments.
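
To make the idea of impurity-based attribute weighting concrete, here is a minimal sketch in Python. The abstract does not give the paper's actual weighting formula, so everything beyond the standard definitions of entropy and Gini impurity is an assumption: the function `attribute_weights`, its score (impurity over all objects minus the size-weighted impurity inside each cluster), the normalization, and the toy data are hypothetical stand-ins, not the author's method.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of category labels."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gini(values):
    """Gini impurity of a sequence of category labels."""
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())


def attribute_weights(clusters, impurity):
    """Hypothetical weighting, NOT the paper's formula.

    Scores attribute j by (impurity over all objects) minus
    (size-weighted mean impurity inside each cluster), then normalizes.
    """
    rows = [r for members in clusters.values() for r in members]
    scores = []
    for j in range(len(rows[0])):
        pooled = impurity([r[j] for r in rows])
        within = sum(
            len(members) / len(rows) * impurity([r[j] for r in members])
            for members in clusters.values()
        )
        scores.append(max(pooled - within, 0.0))
    total = sum(scores)
    if total == 0:  # no attribute separates the clusters: fall back to uniform
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def weighted_mismatch(x, mode, weights):
    """Weighted simple-matching dissimilarity between an object and a mode."""
    return sum(w for xi, mi, w in zip(x, mode, weights) if xi != mi)


# Toy data (made up): attribute 0 separates the clusters, attribute 1 does not.
clusters = {
    0: [("red", "s"), ("red", "m"), ("red", "l")],
    1: [("blue", "s"), ("blue", "m"), ("blue", "l")],
}
weights = attribute_weights(clusters, entropy)
distance = weighted_mismatch(("blue", "s"), ("red", "s"), weights)
```

In this toy data the first attribute is pure inside each cluster and mixed overall, so it receives all of the weight, while the second attribute is equally mixed everywhere and receives none. Passing `gini` instead of `entropy` swaps the impurity measure without changing the rest of the sketch.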

Journal: Journal of Intelligent & Fuzzy Systems
Publication Date: Jan 13, 2017
Citations: 6

Similar Papers

A Review on K-Mode Clustering Algorithm
Manisha Goyal
International Journal of Advanced Research in Computer Science | VOL. 8
20 Aug 2017

K-Distributions: A New Algorithm for Clustering Categorical Data
Zhihua Cai ... Liangxiao Jiang
01 Jan 2007

Deer hunting optimization technique for clustering unsupervised data in data mining
Hayder Hussein Azeez
International Journal of Modeling, Simulation, and Scientific Computing | VOL. 14
11 Jun 2022

A Clustering Algorithm Based on Symmetric Neighborhood of Micro-clusters
Yu Zhang ... Dechang Pi
01 Mar 2009
