Abstract

Clustering analysis is an important and difficult task in data mining and big data analysis. Although variable clustering is a widely used clustering technique, it has received little attention in previous studies. Inspired by the metaheuristic optimization techniques developed for clustering data items, we use metaheuristic optimization to overcome the main shortcoming of the k-means-based variable clustering algorithm, namely its sensitivity to the initial centroids. In our previous work, we studied a novel memetic algorithm named MCLPSO (Memetic Comprehensive Learning Particle Swarm Optimization), which is based on CLPSO (Comprehensive Learning Particle Swarm Optimization), under the framework of memetic computing. In this work, MCLPSO is used as a metaheuristic approach to improve the k-means-based variable clustering algorithm by iteratively adjusting the initial centroids to maximize the homogeneity of the clustering results. MCLPSO uses a chaotic local search operator together with a simulated annealing (SA) based local search strategy, which is developed by combining the cognition-only PSO model with SA. The adaptive memetic strategy enables stagnant particles, which the comprehensive learning strategy can no longer improve, to escape from local optima, and it enables some elite particles to perform a fine-grained local search around promising regions. The experimental results show that MCLPSO performs well in optimizing the variable clustering criterion on several datasets compared with the original variable clustering method. Finally, for practical use, we developed a web-based interactive software platform for the proposed approach and present a practical case study, analyzing the performance of a semiconductor manufacturing system, to demonstrate its usage.

Highlights

  • Clustering analysis, or clustering, is the task of grouping a set of objects such that, according to some similarity measure, objects in the same group are more similar to each other than to objects in different groups

  • In our previous research [17], we proposed a novel memetic algorithm, GS-MPSO, and used it to optimize the initial centroids for k-means clustering

  • We have developed several novel memetic algorithms under the framework of memetic computation (MC), and these memetic algorithms have been applied to data clustering [17] and missing data estimation [24]


Summary

Introduction

Clustering analysis, or clustering, is the task of grouping a set of objects such that, according to some similarity measure, objects in the same group (called a cluster) are more similar to each other than to objects in different groups (clusters). Almost all the metaheuristic-based improvements to clustering algorithms in the literature are devoted to clustering data items, even though clustering variables is also a common technique. Building on our previous work, we studied a metaheuristic approach for the variable clustering algorithm. MCLPSO [4] improves CLPSO [5] in two respects: a chaotic local search and an SA-based local search. MCLPSO is used as a metaheuristic approach to optimize the k-means-based variable clustering algorithm. Experimental results demonstrate that MCLPSO can improve the k-means-based variable clustering algorithm effectively. To facilitate practical use of the MCLPSO-based variable clustering algorithm, we developed an interactive software system for this approach and present a real-world case study.
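To make the optimization target concrete, the following is a minimal sketch of a k-means-style variable clustering routine whose result depends on the initial centroids. It is not the authors' implementation. The function names `variable_kmeans` and `objective` are hypothetical, and several details are assumptions that this summary does not specify: homogeneity is taken to be the sum of squared correlations between each variable and its cluster centroid, each centroid is updated to the first principal component of its member variables, and a candidate solution (for example a particle position) is assumed to encode the flattened initial centroids.

```python
import numpy as np


def variable_kmeans(X, init_centroids, n_iter=50):
    """Cluster the columns (variables) of X, starting from given centroids.

    X              : (n_samples, n_vars) data matrix.
    init_centroids : (k, n_samples) initial centroid vectors.
    Returns (labels, homogeneity). Homogeneity is ASSUMED to be the sum of
    squared correlations between each variable and its cluster centroid.
    """
    # Standardize variables so that (dot product / n) is a correlation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n, n_vars = Z.shape
    C = np.asarray(init_centroids, dtype=float).copy()
    labels = None
    for _ in range(n_iter):
        # Standardize the centroids in the same way.
        Cz = C - C.mean(axis=1, keepdims=True)
        Cz /= Cz.std(axis=1, keepdims=True) + 1e-12
        # Squared correlation of every centroid with every variable: (k, n_vars).
        r2 = (Cz @ Z / n) ** 2
        new_labels = r2.argmax(axis=0)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
        for j in range(C.shape[0]):
            members = Z[:, labels == j]
            if members.shape[1] == 0:
                continue  # empty cluster: keep the previous centroid
            # ASSUMPTION: centroid = first principal component of the members.
            u, _, _ = np.linalg.svd(members, full_matrices=False)
            C[j] = u[:, 0]
    homogeneity = float(r2[labels, np.arange(n_vars)].sum())
    return labels, homogeneity


def objective(position, X, k):
    """Fitness of one candidate: decode initial centroids, return homogeneity."""
    init_centroids = np.asarray(position, dtype=float).reshape(k, X.shape[0])
    return variable_kmeans(X, init_centroids)[1]


# Toy data: variables 0-2 follow latent factor a, variables 3-5 follow b.
rng = np.random.default_rng(0)
a, b = rng.normal(size=200), rng.normal(size=200)
X_demo = np.column_stack(
    [a + 0.1 * rng.normal(size=200) for _ in range(3)]
    + [b + 0.1 * rng.normal(size=200) for _ in range(3)]
)
# Seed the two centroids with variables 0 and 3.
demo_labels, demo_h = variable_kmeans(X_demo, np.vstack([X_demo[:, 0], X_demo[:, 3]]))
print(demo_labels, round(demo_h, 3))
```

Under these assumptions, any optimizer that maximizes `objective` over candidate positions plays the role the text assigns to MCLPSO: it searches for initial centroids that lead the clustering routine to a more homogeneous partition. The swarm update rules, the chaotic local search, and the SA-based local search are not reproduced here.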

Related Work
Memetic Comprehensive Learning PSO
Experiment
Methods
A Web-Based Interactive Software Platform
Findings
Conclusion