A Novel Consensus Fuzzy K-Modes Clustering Using Coupling DNA-Chain-Hypergraph P System for Categorical Data

Zhenni Jiang, Xiyu Liu

doi:10.3390/pr8101326

https://doi.org/10.3390/pr8101326

Journal: Processes	Publication Date: Oct 21, 2020
Citations: 5	License type: CC BY 4.0

Affiliation: Shandong Normal University

Abstract

In this paper, a data clustering method named consensus fuzzy k-modes clustering is proposed to improve clustering performance on categorical data. A coupling DNA-chain-hypergraph P system is constructed to carry out the clustering process. This P system can prevent the clustering algorithm from falling into a local optimum and realizes the clustering process with implicit parallelism. The consensus fuzzy k-modes algorithm combines the advantages of the fuzzy k-modes algorithm, the weight fuzzy k-modes algorithm and the genetic fuzzy k-modes algorithm. The fuzzy k-modes algorithm produces a soft partition, which is closer to reality, but treats all variables equally. The weight fuzzy k-modes algorithm introduces a weight vector, which strengthens basic k-modes clustering by assigning higher weights to the features that are useful in the analysis. These two methods only improve the k-modes algorithm itself. The genetic k-modes algorithm was therefore proposed, which uses genetic operations in the clustering process. In this paper, we examine these three kinds of k-modes algorithms and further introduce DNA genetic optimization operations in the final consensus process. Finally, we conduct experiments on seven UCI datasets and compare the clustering results with those of four other categorical clustering algorithms. The experimental results and statistical tests show that our method obtains better clustering results than each of the compared algorithms.
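
To make the "soft partition" idea concrete, the following is a minimal sketch of one iteration of the standard fuzzy k-modes algorithm, which is one of the three base algorithms named in the abstract. It is not the consensus method or the P system proposed in the paper. The function names, the toy data and the fuzziness exponent `alpha=1.5` are assumptions chosen for illustration.

```python
from collections import defaultdict

def mismatch(a, b):
    """Simple matching dissimilarity: number of attributes that differ."""
    return sum(x != y for x, y in zip(a, b))

def memberships(data, modes, alpha=1.5):
    """Fuzzy membership of every object in every cluster (each row sums to 1)."""
    exponent = 1.0 / (alpha - 1.0)
    rows = []
    for obj in data:
        dists = [mismatch(obj, m) for m in modes]
        if 0 in dists:
            # The object coincides with a mode: give it full membership there.
            hit = dists.index(0)
            row = [1.0 if l == hit else 0.0 for l in range(len(modes))]
        else:
            row = [1.0 / sum((d_l / d_h) ** exponent for d_h in dists)
                   for d_l in dists]
        rows.append(row)
    return rows

def update_modes(data, W, k, alpha=1.5):
    """For each cluster and attribute, pick the category with the largest
    total of membership**alpha over the objects carrying that category."""
    modes = []
    for l in range(k):
        mode = []
        for j in range(len(data[0])):
            score = defaultdict(float)
            for obj, row in zip(data, W):
                score[obj[j]] += row[l] ** alpha
            mode.append(max(score, key=score.get))
        modes.append(tuple(mode))
    return modes

# Toy categorical data (assumed): two attributes per object.
data = [("A", "x"), ("A", "y"), ("B", "y"), ("B", "z")]
modes = [("A", "x"), ("B", "z")]
W = memberships(data, modes)
modes = update_modes(data, W, k=2)
```

Because every attribute contributes equally to `mismatch`, this sketch also shows the limitation the abstract mentions: the plain fuzzy k-modes algorithm treats all variables equally.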

Highlights

Data clustering has recently attracted increasing attention in practical applications
At the same time, considering the structure of the DCHP system and the characteristics of the three basic clustering algorithms, we need to guarantee that the basic partitions (BPs) generated by the different algorithms are the same
We propose a novel P system (DCHP) with a hybrid structure that combines the advantages of the chain structure and the hypergraph topology structure for consensus fuzzy k-modes clustering

Summary

Introduction

Data clustering has recently attracted increasing attention in practical applications. In this method, the distance between the cluster center and the data objects is calculated with standard distance metrics. Many categorical datasets, however, have no natural order or distance between their values. In the real world, for example, the categorical attribute blood type takes a single value from a set such as [A, B, O, AB]. Research into categorical data is a difficult and challenging task that attracts many data mining researchers.
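
The blood type example can be illustrated with a short sketch. It contrasts an arbitrary integer encoding, which imposes an order the data do not have, with the simple matching dissimilarity commonly used for categorical attributes. The integer codes and function names below are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations

BLOOD_TYPES = ["A", "B", "O", "AB"]

# Arbitrary integer labels: any other assignment would be equally valid.
codes = {t: i for i, t in enumerate(BLOOD_TYPES)}

def numeric_gap(a, b):
    """Distance under the arbitrary integer encoding."""
    return abs(codes[a] - codes[b])

def simple_matching(a, b):
    """0 if the categories are equal, 1 otherwise."""
    return 0 if a == b else 1

for a, b in combinations(BLOOD_TYPES, 2):
    print(a, b, numeric_gap(a, b), simple_matching(a, b))
```

Under the integer encoding, A appears three times as far from AB as from B, which is purely an effect of the labeling. Simple matching treats every pair of distinct blood types as equally different.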

Methods

Results

Conclusion