Abstract

In this research, we propose two variants of the Firefly Algorithm (FA), namely inward intensified exploration FA (IIEFA) and compound intensified exploration FA (CIEFA), to address the persistent problems of initialization sensitivity and local optima traps in the K-means clustering model. To enhance both exploitation and exploration, matrix-based search parameters and dispersing mechanisms are incorporated into the two proposed FA models. In the IIEFA model, we first replace the attractiveness coefficient with a randomized control matrix to release the FA from the constraints of biological law: the exploitation capability in the neighbourhood is elevated from a one-dimensional to a multi-dimensional search mechanism, with enhanced diversity in search scopes, scales, and directions. In the second model, CIEFA, we additionally employ a dispersing mechanism that dispatches fireflies with high similarities to new positions outside the close neighbourhood to perform global exploration. This dispersing mechanism ensures sufficient variance among the fireflies being compared, which increases search efficiency. The ALL-IDB2 database, a skin lesion data set, and a total of 15 UCI data sets are employed to evaluate the efficiency of the proposed FA models on clustering tasks. The minimum Redundancy Maximum Relevance (mRMR)-based feature selection method is also adopted to reduce feature dimensionality. The empirical results indicate that the proposed FA models demonstrate statistically significant superiority in both distance and performance measures for clustering tasks in comparison with conventional K-means clustering, five classical search methods, and five advanced FA variants.
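The shift from a one-dimensional to a multi-dimensional search can be illustrated with a small sketch. The first function is the widely used standard FA move, in which a single scalar attractiveness scales every dimension. The second is only an assumption about what a randomized control matrix might look like: one independent random coefficient per dimension, drawn uniformly from [0, 1). The function names, parameter defaults, and the sampling range are hypothetical and are not taken from the paper.

```python
import math
import random


def fa_move(xi, xj, beta0=1.0, gamma=1.0, alpha=0.0, rng=random):
    """Standard FA step: move firefly xi toward a brighter firefly xj.

    A single scalar attractiveness (beta) scales every dimension.
    """
    r2 = sum((a - b) ** 2 for a, b in zip(xi, xj))  # squared distance
    beta = beta0 * math.exp(-gamma * r2)
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]


def matrix_move(xi, xj, alpha=0.0, rng=random):
    """Hypothetical matrix-style step (an assumption, not the paper's formula).

    Each dimension gets its own random coefficient in [0, 1).
    """
    return [a + rng.random() * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]
```

With the random-walk term switched off (`alpha=0`), `fa_move` can only place the firefly on the straight line through `xi` and `xj`, whereas `matrix_move` can land anywhere inside the axis-aligned box spanned by the two positions.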

Highlights

  • Clustering analysis is one of the fundamental methods of discovering and understanding underlying patterns embodied in data by partitioning data objects into several clusters according to measured or perceived intrinsic characteristics or similarity [1]

  • After examining the proposed inward intensified exploration FA (IIEFA) and compound intensified exploration FA (CIEFA) models both theoretically and experimentally in the above section, we further extend our evaluation to more challenging clustering tasks with both high dimensionalities and complex cluster distributions, as an attempt to ascertain the performance of the proposed methods more comprehensively

  • We have proposed two Firefly Algorithm (FA) variants, namely IIEFA and CIEFA, to undertake the problems associated with initialization sensitivity and local optima traps of the conventional KM clustering algorithm


Summary

Introduction

Clustering analysis is one of the fundamental methods of discovering and understanding underlying patterns embodied in data by partitioning data objects into several clusters according to measured or perceived intrinsic characteristics or similarity [1]. As a result of the clustering process, data samples with high similarity are grouped in the same cluster, while dissimilar samples are placed in different clusters. Conventional clustering algorithms can be broadly categorized into two groups: partitioning and hierarchical methods. The partitioning methods divide data samples into several clusters simultaneously, where each instance belongs exclusively to one cluster. The hierarchical methods build a hierarchy of clusters, in either an agglomerative or a divisive mode. K-means (KM) clustering is one of the popular partitioning methods, and is widely used owing to its simplicity, efficiency, and ease of implementation [1].
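As an illustration of a partitioning method, the following is a minimal sketch of K-means in its common Lloyd's-algorithm form, written with the Python standard library only. It is not the implementation evaluated in the paper; the function names, the random choice of initial centroids, and the stopping rule are assumptions made for this example. The initial centroids are drawn at random from the data, which is the step whose sensitivity the proposed FA models aim to address.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Minimal K-means sketch: returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumed initialization: k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to exactly one nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids
```

Calling `kmeans(points, k=2)` on a list of coordinate tuples returns one label per point, so every instance is assigned to a single cluster, which is the defining property of a partitioning method.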
