Abstract

Cluster-based multi- and many-objective optimization techniques were originally proposed to reduce the size of the Pareto set. More recently, researchers have used such techniques to find better solutions in the objective space when solving engineering problems. In this work, we apply a cluster-based approach to solution selection in a multiobjective evolutionary algorithm based on decomposition (MOEA/D) combined with bare bones particle swarm optimization (BBPSO) for data clustering, and we investigate its clustering performance. In our previous work, we found that MOEA/D with BBPSO performed the best on 10 datasets. Here, we extend that work by adding a cluster-based approach and testing it on 13 UCI datasets. We compare the proposed technique with six multiobjective evolutionary clustering algorithms from the existing literature and ten from our previous work. The proposed technique performs well on datasets with highly overlapping clusters, such as CMC and Sonar. So far, we have found only one work that uses a cluster-based multiobjective evolutionary algorithm (MOEA) for clustering data, the hierarchical topology multiobjective clustering algorithm. All other cluster-based MOEAs we found were applied to problems other than data clustering. By clustering the Pareto solutions and evaluating new candidates against the resulting cluster representatives, local search is introduced into the solution selection process within the objective space, which can be effective on datasets with highly overlapping clusters. This adds a layer of search control in the objective space. The results are promising and suggest several directions for future research, which we discuss, including the effect of an increasing number of clusters and the use of other objective functions.

Highlights

  • Clustering is widely used to find hidden structures in data

  • We developed a novel framework for data clustering that requires little to no parameter setting. It combines a simple swarm intelligence technique, BBPSO, with a multiobjective optimization (MOO) technique, the multiobjective evolutionary algorithm based on decomposition (MOEA/D), and uses a solution update based on K-means

  • BBPSO comparisons: in Table 1, we compare the performance of CM-BBPSO with MOEA/D BBPSO and BBPSO on 13 UCI datasets using the following metrics: accuracy, F1-score (F1), Kappa (Cohen’s κ index), between sum of squares (BSS), within sum of squares (WSS) and quantization error (QE)
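The three unsupervised metrics named in the last highlight can be illustrated with a minimal sketch. It assumes Euclidean distance, cluster centroids computed as the mean of the assigned objects, and the definition of quantization error common in the PSO clustering literature (the mean, over clusters, of the average distance from each object to its centroid). The page does not give the exact definitions used in the paper, so these are assumptions, and the function names `centroid` and `cluster_metrics` are hypothetical.

```python
from math import dist


def centroid(points):
    """Mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def cluster_metrics(data, labels):
    """Return (WSS, BSS, QE) for a hard clustering given as one label per object.

    Assumed definitions (not confirmed by the source):
      WSS: sum of squared distances from each object to its cluster centroid
      BSS: sum over clusters of size * squared distance from centroid to overall mean
      QE:  mean over clusters of the average object-to-centroid distance
    """
    clusters = {}
    for z, label in zip(data, labels):
        clusters.setdefault(label, []).append(z)

    overall = centroid(data)
    wss = bss = qe = 0.0
    for members in clusters.values():
        c = centroid(members)
        wss += sum(dist(z, c) ** 2 for z in members)
        bss += len(members) * dist(c, overall) ** 2
        qe += sum(dist(z, c) for z in members) / len(members)
    return wss, bss, qe / len(clusters)


# Toy data: two well-separated clusters of two points each.
data = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
labels = [0, 0, 1, 1]
wss, bss, qe = cluster_metrics(data, labels)
print(wss, bss, qe)  # 4.0 16.0 1.0
```

Under these definitions, lower WSS and QE indicate more compact clusters, while higher BSS indicates better separation. WSS and BSS always sum to the total sum of squares around the overall mean (20.0 in the toy example).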


Summary

Introduction

Clustering is widely used to find hidden structures in data. In clustering, a set of C cluster centers v = {v1, . . . , vC} represents the prototypes of the clusters. Each cluster contains similar objects from a dataset Z = {z1, . . . , zN}. The goal of clustering is to learn the partition matrix, which records whether an object zj belongs to the cluster Ci. It is a C × N matrix U = [uij], where i = 1, . . . , C and j = 1, . . . , N. In hard clustering, such as k-means, uij = 1 if zj ∈ Ci and uij = 0 otherwise.
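The hard partition matrix described above can be sketched in a few lines. This is an illustrative example only: it assumes Euclidean distance and the k-means assignment rule (each object belongs to the cluster with the nearest center), and the function name `hard_partition_matrix` is hypothetical.

```python
from math import dist


def hard_partition_matrix(data, centers):
    """Build the C x N matrix U with u[i][j] = 1 if object j is assigned
    to cluster i, and 0 otherwise.

    Assumption: an object belongs to the cluster whose center is nearest
    in Euclidean distance (the k-means assignment rule).
    """
    n_clusters, n_objects = len(centers), len(data)
    u = [[0] * n_objects for _ in range(n_clusters)]
    for j, z in enumerate(data):
        nearest = min(range(n_clusters), key=lambda i: dist(z, centers[i]))
        u[nearest][j] = 1
    return u


# Toy data: N = 4 objects, C = 2 cluster centers.
data = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
centers = [(0.0, 0.0), (5.0, 5.0)]
U = hard_partition_matrix(data, centers)
for row in U:
    print(row)
# [1, 1, 0, 0]
# [0, 0, 1, 1]
```

Each column of U contains exactly one 1, which reflects that in hard clustering every object belongs to exactly one cluster.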

