Abstract

Ensemble clustering integrates multiple base clusterings to improve on the generalization ability of a single clustering algorithm and to produce a more robust result, which has made it a focus of current clustering research. Its aim is to find a consensus partition that agrees as much as possible with the base clusterings. The genetic algorithm is a highly parallel, stochastic, and adaptive search algorithm derived from the mechanisms of natural selection and biological evolution. In this paper, an improved genetic algorithm is designed by improving the chromosome encoding. A new membrane evolutionary algorithm is then constructed by using genetic mechanisms as evolution rules and combining them with the communication mechanism of a cell-like P system. The proposed algorithm is used to optimize the base clusterings and to find the optimal chromosome, which is taken as the final ensemble clustering result. Owing to the global optimization ability of the genetic algorithm and the rapid convergence of the membrane system, the membrane evolutionary algorithm performs better than several state-of-the-art techniques on six real-world UCI data sets.
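To make the idea of a consensus partition that "agrees as much as possible" with the base clusterings concrete, the following is a minimal sketch under stated assumptions. It scores a candidate partition (a label vector, which could serve as a chromosome) by its mean Rand index against the base clusterings. The Rand index is a standard agreement measure chosen here for illustration only; the text above does not specify the paper's fitness function or chromosome encoding, and the names `rand_index` and `consensus_fitness` and the toy data are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two labelings agree.

    A pair counts as an agreement when both labelings put the two objects
    in the same cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def consensus_fitness(candidate, base_clusterings):
    """Mean Rand index between a candidate partition and every base clustering.

    Assumption: this stands in for whatever fitness the paper actually uses.
    """
    total = sum(rand_index(candidate, base) for base in base_clusterings)
    return total / len(base_clusterings)


# Three hypothetical base clusterings of six objects (label values are arbitrary).
base_clusterings = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels permuted
    [0, 0, 1, 1, 2, 2],
]

# Two hypothetical candidate partitions, e.g. chromosomes in a population.
candidates = [
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
]

# The candidate with the highest mean agreement is kept as the consensus.
best = max(candidates, key=lambda c: consensus_fitness(c, base_clusterings))
print(best, round(consensus_fitness(best, base_clusterings), 4))
```

Because the Rand index compares pairs of objects rather than label values, it is unaffected by label permutations, which is why the first two base clusterings count as identical. A genetic search would replace the fixed candidate list with a population that is repeatedly selected, recombined, and mutated under a fitness of this kind.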

Highlights

  • Cluster analysis, known as clustering, is a core technique in machine learning and artificial intelligence [1]. It divides a set of data objects into subsets, each called a cluster, so that objects in the same cluster are as similar as possible and objects in different clusters are as different as possible

  • Ensemble clustering, also known as consensus clustering or cluster aggregation, reconciles clustering results obtained from different clustering algorithms [2] or from runs of the same algorithm with different initialization parameters [3]

  • Compared with a single clustering algorithm, a clustering ensemble algorithm is more robust and stable, and its results are insensitive to noise, isolated points, and sampling changes, so ensemble clustering has become a hotspot of clustering research in recent years

Summary

Introduction

Cluster analysis, known as clustering, is a core technique in machine learning and artificial intelligence [1]. It divides a set of data objects into subsets, each called a cluster, so that objects in the same cluster are as similar as possible and objects in different clusters are as different as possible. The P system, a membrane computing model, is a biologically inspired computational model based on the study of living cells, initiated by Paun in 1998. It aims to carry out computation by simulating the functions of living cells, tissues, and organs. Objects in this model, which has complete computing capability, can evolve in a maximally parallel and distributed manner [20]. It is precisely this maximal parallelism that allows multiple cell objects to evolve concurrently in the search for the optimal solution, an effect similar to multipopulation evolution, which improves the performance of ensemble clustering.

Preliminaries
Improved GA-Based Ensemble Clustering Algorithm
The Proposed GA-Based Membrane Evolutionary Algorithm
Experiment Analysis
Method
Concluding Remarks
