Abstract

Data clustering is an important and commonly applied data mining technique for retrieving useful information from a dataset. Recently, nature-inspired algorithms have been proposed and used for solving optimization problems in general, and the data clustering problem in particular. The Black Hole (BH) optimization algorithm, a population-based metaheuristic that emulates the phenomenon of black holes in the universe, has been highlighted as one solution for data clustering problems. In BH, every solution moving through the search space represents an individual star. The original BH has shown superior performance when applied to a benchmark dataset, but it lacks exploration capability on some datasets. To address this exploration issue, this paper introduces Levy flight into the BH algorithm, resulting in a novel data clustering method, “Levy Flight Black Hole (LBH)”. In LBH, the movement of each star depends mainly on the step size generated by the Levy distribution. A star therefore explores an area far from the current black hole when the step size is large, and stays close to it when the step size is small. The performance of LBH in terms of finding the best solutions, avoiding getting stuck in local optima, and convergence rate has been evaluated on several unimodal and multimodal numerical optimization problems. Additionally, LBH is tested on six real datasets available from the UCI Machine Learning Repository. The experimental results indicate that the proposed algorithm is suitable for data clustering, showing both effectiveness and robustness.
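
The Levy-driven star movement described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's update equation, so everything below is an assumption: the step length is drawn with Mantegna's algorithm (a common way to sample Levy-like steps), and the star is moved along the direction toward the black hole, as in the standard BH update, with the uniform random factor replaced by a scaled Levy step. The names `mantegna_sigma`, `levy_step`, `move_star`, and the parameters `beta` and `scale` are hypothetical.

```python
import math
import random


def mantegna_sigma(beta=1.5):
    """Standard deviation of the numerator Gaussian in Mantegna's algorithm."""
    numerator = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    denominator = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (numerator / denominator) ** (1 / beta)


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length (Mantegna's algorithm, 0 < beta < 2)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero (practically never happens)
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def move_star(star, black_hole, beta=1.5, scale=0.01, rng=random):
    """Move a star relative to the black hole using one Levy step per dimension.

    Assumed rule (not taken from the paper):
        x_new = x + scale * levy * (x_bh - x)
    A large |levy| sends the star far from its current position;
    a small one keeps the move local.
    """
    return [
        x + scale * levy_step(beta, rng) * (bh - x)
        for x, bh in zip(star, black_hole)
    ]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs match
    star = [4.0, -3.0, 1.5]
    black_hole = [0.0, 0.0, 0.0]  # best solution found so far
    for _ in range(5):
        star = move_star(star, black_hole, rng=rng)
        print(star)
```

Most draws from `levy_step` are small, which keeps the search local, while occasional large draws produce the long jumps that give the exploration behaviour the abstract refers to. The `scale` factor is a tuning knob in this sketch and would need adjusting to the search-space range.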

Highlights

  • Data clustering is a method that places similar objects together: like items are grouped in one cluster, and unlike items are placed in different clusters

  • To further verify that the proposed algorithm explores better than the standard Black Hole (BH) algorithm, it has been evaluated on a set of unimodal and multimodal benchmark test functions in a multi-dimensional space, as defined in [61]–[63]

  • The comparison stage benchmarks the algorithm against nine well-known metaheuristics: Big Bang–Big Crunch [64], Artificial Bee Colony (ABC) [65], Particle Swarm Optimization (PSO) [66], Levy Firefly Algorithm (LFFA) [46], Grey Wolf Optimizer (GWO) [19], Gravitational Search Algorithm (GSA) [67], Bat Algorithm (BA) [23], Cat Swarm Algorithm (CSA) [68], and Black Hole (BH) [21]

Summary

Introduction

Data clustering is a method that places similar objects together: like items are grouped in one cluster, and unlike items are placed in different clusters. It is an unsupervised learning technique in which objects are grouped into clusters that are not specified in advance. Its applications include data analysis, machine learning, pattern recognition, image analysis, information retrieval, and more. Clustering methods can be categorized as partitional, hierarchical, density-based, grid-based, and model-based methods [2]. Because its result depends on how the cluster centers are initialized, the k-means clustering algorithm tends to converge to local optima [3]. Nevertheless, the past few decades have witnessed the development of many nature-inspired evolutionary algorithms for solving engineering design optimization
