Data Clustering Using Moth-Flame Optimization Algorithm

Tribhuvan Singh, Mohamed Abdalla, Nitin Saxena, Hammam Alshazly, Manju Khurana, Dilbag Singh

doi:10.3390/s21124086

Abstract

The k-means algorithm is a widely accepted clustering method. However, its performance depends heavily on the initial cluster centers. In addition, because of its weak exploration capability, it easily gets stuck in local optima. Recently, a new metaheuristic called the Moth Flame Optimizer (MFO) was proposed to handle complex problems. MFO simulates the navigation mechanism that moths use in nature, known as transverse orientation. In various research works, the performance of MFO has been found quite satisfactory. This paper proposes a novel heuristic approach based on MFO to solve data clustering problems. To validate the competitiveness of the proposed approach, various experiments were conducted using Shape and UCI benchmark datasets. The proposed approach is compared with five state-of-the-art algorithms over twelve datasets. The mean performance of the proposed algorithm is superior on ten datasets and comparable on the remaining two. The analysis of the experimental results confirms the efficacy of the suggested approach.

Highlights

The k-means algorithm is simple and efficient, but the accuracy of its results depends heavily on the initially selected cluster centers, and it is prone to becoming trapped in local optima
The parameters of the BHA, Multi-Verse Optimizer (MVO), Harris Hawks Optimizer (HHO), Grey Wolf Optimizer (GWO), and k-means algorithms were set according to references [19,44,45,46,47], respectively
The k-means algorithm showed the worst performance on all benchmark datasets

Summary

Introduction

Data clustering methods are widely used in real-world applications such as data mining [1], machine learning [2], information retrieval [3,4], pattern recognition [5,6,7], face clustering and recognition [8,9,10], and wireless sensor networks [11]. The objective of clustering is to partition data objects so as to minimize the accumulated distances between the data objects and their respective centroids. A number of algorithms have been proposed to solve the data clustering problem. Data clustering is an NP-hard problem and is difficult to solve using deterministic algorithms. On such a problem, deterministic approaches tend to become trapped in local optima, which in turn degrades the overall performance of the algorithm.
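The objective and the local-entrapment issue described above can be illustrated with a small sketch. It is not the paper's implementation: the four toy points, the two starting configurations, and the function names `nearest`, `clustering_cost`, and `kmeans` are all assumptions chosen for illustration. The cost is taken to be the sum of Euclidean distances from each point to its nearest centroid, and `kmeans` is a plain deterministic iteration that reassigns points and recomputes means.

```python
import math

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))

def clustering_cost(points, centroids):
    """Accumulated distance between each point and its nearest centroid."""
    return sum(math.dist(p, centroids[nearest(p, centroids)]) for p in points)

def kmeans(points, centroids, iterations=20):
    """Deterministic k-means iterations from the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
        if updated == centroids:  # no movement, so the iteration has converged
            break
        centroids = updated
    return centroids

# Hypothetical toy data: two tight pairs of points, far apart horizontally.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

# Two hypothetical initializations of the same deterministic procedure.
result_a = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
result_b = kmeans(points, [(5.0, 0.0), (5.0, 2.0)])

cost_a = clustering_cost(points, result_a)
cost_b = clustering_cost(points, result_b)
print(result_a, cost_a)
print(result_b, cost_b)
```

In this toy setting the second start is already a fixed point of the iteration, so it never reaches the lower-cost partition found from the first start. A population-based search evaluates many candidate centroid sets against the same cost function, which is the kind of exploration the deterministic loop lacks.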

Methods

Results

Discussion

Conclusion