Hybrid Fruit-Fly Optimization Algorithm with K-Means for Text Document Clustering

Timea Bezdan, Miodrag Zivkovic, Ahmed Al Naamany, K Venkatachalam, Nebojsa Bacanin, Catalin Stoean, Tarik A Rashid

doi:10.3390/math9161929

Abstract

The fast-growing Internet produces massive amounts of text data. Because this data is large in volume and unstructured in format, extracting and analyzing relevant information is very challenging. Text document clustering is a text-mining process that partitions a set of text-based documents into mutually exclusive clusters, such that documents within the same cluster are similar to each other, while documents from different clusters differ in content. One of the biggest challenges in text clustering is partitioning the collection of text data by measuring the relevance of the content in the documents. To address this issue, this work proposes a hybrid swarm intelligence algorithm combined with the K-means algorithm for text clustering. First, the hybrid fruit-fly optimization algorithm is tested on ten unconstrained CEC2019 benchmark functions. Next, the proposed method is evaluated on six standard benchmark text datasets. The experimental evaluation on the unconstrained functions, as well as on text-based documents, indicates that the proposed approach is robust and superior to other state-of-the-art methods.
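The partitioning step described in the abstract can be illustrated with a minimal K-means sketch over term-count vectors. This is not the authors' implementation: the toy vocabulary, the document vectors, the use of cosine distance, and the `initial_centroids` parameter are all assumptions made for illustration. In a hybrid scheme, a metaheuristic could plausibly supply the starting centroids, but the text above does not specify how the two algorithms are coupled.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; zero vectors are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def kmeans(vectors, initial_centroids, max_iter=100):
    """Plain K-means: assign each vector to its nearest centroid, then recompute centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document goes to exactly one cluster.
        new_labels = [
            min(range(len(centroids)), key=lambda j: cosine_distance(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the partition is stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical vocabulary: ["goal", "match", "stock", "market"]
docs = [
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [0, 0, 4, 2],
    [0, 1, 2, 3],
]
# Hypothetical starting centroids; a metaheuristic could supply these instead.
labels, centroids = kmeans(docs, initial_centroids=[docs[0], docs[2]])
print(labels)
```

K-means is sensitive to its starting centroids, which is one common motivation for pairing it with a global search method.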

Highlights

Text document clustering has become an important and fast-growing research area, due to the massive amounts of text data produced by the Internet, social media, email and text messages, and other sources
The proposed method is first validated on unconstrained benchmark functions, and then it is applied to Text Document Clustering (TDC)
The performance of the proposed method is validated on 10 modern CEC2019 functions [66], and the results are compared to the original fruit-fly optimization (FFO) and nine other metaheuristic-based approaches (EHOI, EHO, SCA, SSA, grasshopper optimization algorithm (GOA), WOA, BBO, MFO, particle swarm optimization algorithm (PSO)) [67], where the simulations were conducted under similar conditions and the same problem sets were used

Summary

Introduction

Text document clustering has become an important and fast-growing research area, due to the massive amounts of text data produced by the Internet, social media, email and text messages, and other sources. One crucial method in text mining is clustering, which aims to automatically partition a collection of documents into a finite set of homogeneous clusters (groups). Documents within the same cluster are similar to each other in content, while the similarity between documents in different clusters is lower. From the perspective of optimization, clustering can be formulated as an NP-hard optimization problem. Metaheuristic algorithms have been shown to be very efficient at solving NP-hard optimization problems, producing near-optimal solutions in a reasonable amount of time. Metaheuristic algorithms that are inspired by nature can be divided into two major categories: swarm intelligence and evolutionary algorithms. This work proposes a hybrid swarm intelligence algorithm. An opposition-based learning mechanism is incorporated into the hybrid method, which is then combined with the traditional K-means algorithm [4] and employed for text-based document clustering.
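The opposition-based learning mechanism mentioned above can be sketched in its most common form: for a candidate x within bounds [lower, upper], the opposite point is lower + upper - x, and the better of the two is kept. The sketch below applies this at population initialization. The function names, the sphere objective, and the decision to apply the mechanism at initialization are assumptions for illustration; the text here does not state where or how the authors apply it.

```python
import random


def sphere(x):
    """Hypothetical test objective (to be minimized): sum of squares."""
    return sum(v * v for v in x)


def opposite(x, lower, upper):
    """Opposite point of x: each coordinate is mirrored within its bounds."""
    return [lo + hi - v for v, lo, hi in zip(x, lower, upper)]


def obl_initialize(pop_size, lower, upper, objective, rng):
    """Draw a random population, add every opposite point, keep the best pop_size."""
    population = [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(pop_size)
    ]
    candidates = population + [opposite(x, lower, upper) for x in population]
    candidates.sort(key=objective)  # ascending, because we minimize
    return candidates[:pop_size]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
lower, upper = [0.0, 0.0], [10.0, 10.0]
population = obl_initialize(5, lower, upper, sphere, rng)
print(opposite([2.0, 7.0], lower, upper))  # [8.0, 3.0]
```

Evaluating opposite points doubles the number of objective evaluations for that step, in exchange for a wider initial sampling of the search space.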

Journal: Mathematics
Publication Date: Aug 13, 2021
Citations: 84
License type: CC BY 4.0

