A Novel Artificial Bee Colony Based Clustering Algorithm for Categorical Data

Jinchao Ji, Yanlin Zheng, Wei Pang, Zhe Wang, Zhiqiang Ma

doi:10.1371/journal.pone.0127125

Abstract

Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to falling into local optima. To address this issue, we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), which combines the traditional k-modes clustering algorithm with the artificial bee colony approach. We first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt a multi-source search, inspired by the idea of batch processing, to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated in a series of experiments against other popular algorithms for categorical data.
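The abstract does not spell out the one-step k-modes procedure, so the following is only a minimal sketch of what a single pass of standard k-modes might look like: assign each object to its nearest mode, then recompute each mode as the per-attribute most frequent category. The simple matching dissimilarity, the function names, and the handling of empty clusters are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def matching_dissimilarity(x, mode):
    """Simple matching distance: count of attributes where x and mode differ."""
    return sum(a != b for a, b in zip(x, mode))


def one_step_k_modes(data, modes):
    """Run one assign-then-update pass of k-modes (illustrative sketch).

    Returns (new_modes, labels, cost), where cost is the total dissimilarity
    between each object and the mode it was assigned to (the modes passed in).
    """
    labels = []
    cost = 0
    # Assignment step: each object goes to the closest current mode.
    for x in data:
        dists = [matching_dissimilarity(x, m) for m in modes]
        j = min(range(len(modes)), key=dists.__getitem__)
        labels.append(j)
        cost += dists[j]

    # Update step: each mode becomes the most frequent category per attribute.
    new_modes = []
    for j, old_mode in enumerate(modes):
        members = [x for x, lab in zip(data, labels) if lab == j]
        if not members:
            # Assumption: an empty cluster simply keeps its previous mode.
            new_modes.append(tuple(old_mode))
            continue
        new_modes.append(
            tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
        )
    return new_modes, labels, cost


if __name__ == "__main__":
    data = [("a", "x"), ("a", "x"), ("a", "y"),
            ("b", "z"), ("b", "z"), ("c", "z")]
    modes = [("a", "y"), ("c", "z")]
    print(one_step_k_modes(data, modes))
```

In a bee-colony setting, a pass like this could serve as a local refinement applied to a candidate set of modes, with the returned cost used to compare candidates; how the paper actually couples the two is not shown in this excerpt.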

Highlights

As an important technique in data mining, clustering analysis has been used in many fields [1,2], such as information retrieval [3], social media analysis [4], privacy preserving [5], image analysis [6], text analysis [7], and bioinformatics [8].
To evaluate the performance of the proposed artificial bee colony (ABC) based clustering algorithm, ABC-K-Modes, we run it on six real-world categorical datasets: Zoo, Breast cancer, Soybean, Lung cancer, Mushroom, and Dermatology, all of which can be downloaded from the UCI Machine Learning Repository.
The parameters of the proposed ABC-K-Modes algorithm are set as follows: N = 20 and MCN = 1000, which are the typical values used in the original ABC algorithm [30]; L = 5 and T = 5 are set by rule of thumb.
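The reported settings can be collected in a small configuration object, sketched below. The class name `ABCKModesParams` and the positivity check are hypothetical. This excerpt does not define N, MCN, L, or T, so the field comments only restate what the text says about where each value comes from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ABCKModesParams:
    """Hypothetical container for the parameter values quoted in the text."""

    N: int = 20      # typical value from the original ABC algorithm [30]
    MCN: int = 1000  # typical value from the original ABC algorithm [30]
    L: int = 5       # set by rule of thumb
    T: int = 5       # set by rule of thumb

    def __post_init__(self):
        # Assumption: all four parameters are positive integers.
        for name in ("N", "MCN", "L", "T"):
            value = getattr(self, name)
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer, got {value!r}")


if __name__ == "__main__":
    print(ABCKModesParams())
```

Freezing the dataclass keeps one experimental configuration from being modified by accident between runs.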

Summary

Introduction

As an important technique in data mining, clustering analysis has been used in many fields [1,2], such as information retrieval [3], social media analysis [4], privacy preserving [5], image analysis [6], text analysis [7], and bioinformatics [8]. The aim of clustering is to group data objects with similar characteristics into the same cluster, and those with dissimilar characteristics into different clusters. Most existing clustering algorithms in the literature belong to one of two types: hierarchical and partitional. Hierarchical clustering algorithms organize a group of data objects into a dendrogram of nested partitions according to a divisive or agglomerative strategy [9]. Partitional clustering algorithms, by contrast, partition a set of data objects into a pre-defined number of clusters by optimizing an objective cost function. Center-based clustering algorithms are the most popular partitional clustering algorithms. The k-means algorithm is a widely used center-based partitional clustering algorithm due to its simplicity and high efficiency [10]. Considering the uncertainty of data objects, the fuzzy k-

A Novel ABC Based Clustering Algorithm for Categorical Data

Related Work

E(f_i) + (4)

Experimental Results and Discussion

Conclusions and Future Work

Journal: PLOS ONE	Publication Date: May 20, 2015
Citations: 54	License type: CC BY 4.0

