Abstract

Subspace clustering is a data-mining task that groups similar data objects and, at the same time, searches for the subspaces in which these similarities appear. For this reason, subspace clustering is recognized as more general and more complicated than standard clustering. In this article, we present ChameleoClust+, a bioinspired evolutionary subspace clustering algorithm that takes advantage of an evolvable genome structure to detect various numbers of clusters located in different subspaces. ChameleoClust+ incorporates several biolike features such as a variable genome length, both functional and nonfunctional elements, and mutation operators including large rearrangements. It was assessed and compared with state-of-the-art methods on a reference benchmark using both real-world and synthetic data sets. Whereas other algorithms may need complex parameter settings, ChameleoClust+ requires only one intuitive parameter specific to subspace clustering: the maximal number of clusters. The remaining parameters of ChameleoClust+ are related to the evolution strategy (e.g., population size, mutation rate), and a single setting for all of them turned out to be effective for all the benchmark data sets. A sensitivity analysis has also been carried out to study the impact of each parameter on the subspace clustering quality.

Highlights

  • Clustering is a data mining task that aims to group objects sharing similar characteristics into sets over the whole data space

  • Several studies have shown, for instance, that an evolvable genome structure allows evolution to shape the effects of evolution principles themselves, a phenomenon known as evolution of evolution (EvoEvo) (Hindre et al., 2012)

  • We present ChameleoClust+, an evolutionary algorithm that takes advantage of a genome having an evolvable structure to tackle the subspace clustering problem

Introduction

Clustering is a data mining task that aims to group objects sharing similar characteristics into sets (i.e., the clusters) over the whole data space. Among important phenomena in evolutionary biology, the dynamic evolution of the genome structure appears as a promising source of advances for bio-inspired optimization. Important phenomena such as the variable genome length or the variable percentages of coding or functional elements within the genome are related to the evolution of genome structure (Knibbe et al., 2007). Among the state-of-the-art formalisms used for in silico experimental evolution reviewed in Hindre et al. (2012), two models enable genome structure evolution: Knibbe et al. (2007) and Crombach and Hogeweg (2007). Both formalisms have inspired key aspects of our work.
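
To make the idea of an evolvable genome structure more concrete, the following is a minimal sketch of a variable-length genome with functional and nonfunctional elements and a few mutation operators, including large rearrangements. It is not the ChameleoClust+ implementation: the gene layout `(functional, cluster, dim, value)`, the decoding rule (summing the contributions of functional genes), and all function names are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical gene layout (an assumption for illustration, not the paper's
# specification): (functional, cluster, dim, value).


def random_gene(rng, max_clusters, n_dims):
    """Draw one gene uniformly at random."""
    return (rng.random() < 0.5,           # functional or nonfunctional
            rng.randrange(max_clusters),  # cluster index
            rng.randrange(n_dims),        # dimension index
            rng.uniform(-1.0, 1.0))       # contribution to the coordinate


def decode(genome):
    """Sum functional genes into {cluster: {dim: coordinate}}.

    Nonfunctional genes are skipped, so they do not affect the result.
    """
    centroids = defaultdict(dict)
    for functional, cluster, dim, value in genome:
        if functional:
            centroids[cluster][dim] = centroids[cluster].get(dim, 0.0) + value
    return dict(centroids)


def point_mutation(genome, rng, max_clusters, n_dims):
    """Replace one randomly chosen gene; the genome length is unchanged."""
    child = list(genome)
    if child:
        child[rng.randrange(len(child))] = random_gene(rng, max_clusters, n_dims)
    return child


def _segment(genome, rng):
    """Pick the bounds of a non-empty slice [i, j) of a non-empty genome."""
    i, j = sorted(rng.sample(range(len(genome) + 1), 2))
    return i, j


def large_deletion(genome, rng):
    """Remove a random segment, keeping at least one gene."""
    if not genome:
        return []
    i, j = _segment(genome, rng)
    child = genome[:i] + genome[j:]
    return child if child else list(genome)


def large_duplication(genome, rng):
    """Copy a random segment and insert the copy at a random position."""
    if not genome:
        return []
    i, j = _segment(genome, rng)
    k = rng.randrange(len(genome) + 1)
    return genome[:k] + genome[i:j] + genome[k:]


genome = [(True, 0, 1, 0.5), (False, 0, 1, 9.0),
          (True, 0, 1, 0.25), (True, 2, 0, -1.0)]
print(decode(genome))  # {0: {1: 0.75}, 2: {0: -1.0}}
```

In this sketch, the subspace of a cluster is simply the set of dimensions that carry at least one functional gene, so both the number of clusters and their subspaces can change as the genome grows, shrinks, or is rearranged. Nonfunctional genes have no effect on the decoded result, but they are still copied and deleted by the large operators.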

