Abstract

Clustering, an unsupervised learning technique, aims to partition patterns so that similar patterns fall in the same cluster while dissimilar ones are placed in different clusters. The task is a complex optimization problem because the search space, made up of all potential partitions of the data, is very large. Genetic Algorithms (GAs) have emerged as efficient tools for addressing this task, and numerous proposals have consequently been developed in this field. This work offers a comprehensive and critical review of state-of-the-art mono-objective GAs for partitional clustering. From a theoretical standpoint, it examines 22 well-known proposals in detail, covering their encoding strategies, objective functions, genetic operators, local search methods, and parent selection strategies. Based on this information, a specific taxonomy is proposed. From a practical standpoint, a detailed experimental study is conducted to discern the advantages and disadvantages of these approaches. Specifically, 22 different cluster validation indices are used to compare the performance of the clustering techniques. The evaluation is performed across 94 datasets encompassing diverse configurations, including the number of classes, the separation between classes, and pattern dimensionality. The results reveal several findings, such as the key role of local search in optimizing results and reducing the search space. Additionally, representations based on centroids and labels prove more efficient, whereas crossover and mutation operators appear less relevant. Ultimately, while the results are satisfactory, real-world clustering problems introduce additional complexity, especially for algorithms that also aim to determine the number of clusters, resulting in diminished performance and the need to explore new approaches.
Code, datasets and instructions to run algorithms in the LEAL library are available in an associated repository, in order to facilitate future experiments in this environment.
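
To make the ingredients named in the abstract concrete, the following is a minimal illustrative sketch of a mono-objective GA for partitional clustering with a centroid-based encoding and one k-means (Lloyd) step as local search. It is not taken from the LEAL library or from any of the 22 reviewed proposals; the function names (`sq_dist`, `sse`, `kmeans_step`, `evolve`), the sum-of-squared-errors objective, truncation selection, one-point crossover, and all parameter values are assumptions chosen for brevity, and the number of clusters `k` is assumed to be fixed in advance.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sse(points, centroids):
    """Objective: sum of squared distances to the nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans_step(points, centroids):
    """Local search: one Lloyd iteration (assign to nearest centroid, recompute means)."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        groups[nearest].append(p)
    # An empty cluster keeps its previous centroid.
    return [
        tuple(sum(col) / len(g) for col in zip(*g)) if g else c
        for g, c in zip(groups, centroids)
    ]


def evolve(points, k, pop_size=10, generations=20, mutation_rate=0.2, seed=0):
    """Illustrative GA loop; assumes k <= len(points) and pop_size >= 4."""
    rng = random.Random(seed)
    # Centroid-based encoding: each individual is a list of k centroids.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        # Refine every individual with local search before selection.
        population = [kmeans_step(points, ind) for ind in population]
        population.sort(key=lambda ind: sse(points, ind))
        survivors = population[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randint(1, k - 1) if k > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover on the centroid list
            if rng.random() < mutation_rate:
                # Mutation: replace one centroid with a random data point.
                child[rng.randrange(k)] = rng.choice(points)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda ind: sse(points, ind))
```

Calling `evolve(points, k=2)` on a list of coordinate tuples returns the best list of centroids found. Because the Lloyd step never increases the objective and the best individuals are carried over unchanged, the best objective value in this sketch cannot get worse from one generation to the next.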

