Swarm Intelligence Algorithms in Text Document Clustering with Various Benchmarks.

Suganya Selvaraj, Eunmi Choi

doi:10.3390/s21093196

Abstract

Text document clustering refers to the unsupervised classification of textual documents into clusters based on content similarity. It can be applied to tasks such as search optimization and extracting hidden information from data generated by IoT sensors. Swarm intelligence (SI) algorithms rely on stochastic and heuristic principles: simple, unintelligent individuals follow a few simple rules and together accomplish very complex tasks. By mapping the features of a problem to the parameters of an SI algorithm, these algorithms can reach solutions in a flexible, robust, decentralized, and self-organized manner. These solving mechanisms make swarm algorithms more suitable than traditional clustering algorithms for resolving complex document clustering problems. However, each SI algorithm performs differently, depending on its own strengths and weaknesses. In this paper, to find the best-performing SI algorithm for text document clustering, we performed a comparative study of the particle swarm optimization (PSO), bat, grey wolf optimization (GWO), and K-means algorithms using six data sets of various sizes, created from BBC Sport news and 20 newsgroups. Based on our experimental results, we discuss the features of the document clustering problem in relation to the nature of SI algorithms and conclude that the PSO and GWO SI algorithms are better than K-means and that, among those algorithms, PSO performs best in terms of finding the optimal solution.
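To make the idea of mapping a clustering problem onto SI parameters concrete, the following is a minimal sketch of PSO-based clustering. It is an illustration under stated assumptions, not the authors' implementation: each particle encodes a candidate set of k centroids, fitness is the sum of squared Euclidean distances to the nearest centroid, and the function names (`sse`, `pso_cluster`), the toy 2-D points, and the values of `w`, `c1`, and `c2` are all hypothetical choices. Real document clustering would operate on high-dimensional document vectors instead.

```python
import random


def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def pso_cluster(points, k, n_particles=10, n_iter=50,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for k centroids with a basic global-best PSO (assumed settings)."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle is one candidate solution: k centroids drawn from the data.
    pos = [[list(rng.choice(points)) for _ in range(k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in particle] for particle in pos]
    pbest_fit = [sse(particle, points) for particle in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Inertia + pull toward personal best + pull toward global best.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit = sse(pos[i], points)
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], fit
    return gbest, gbest_fit


# Toy usage: two well-separated groups of 2-D points.
toy_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
toy_centroids, toy_fit = pso_cluster(toy_points, k=2)
print(toy_centroids, toy_fit)
```

The swarm keeps no central controller: every particle only tracks its own best position and the shared global best, which is one way to read the "decentralized, self-organized" behaviour described above.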

Highlights

Text document clustering is an application of cluster analysis: the unsupervised classification of textual documents into clusters based on content similarity
We used standard particle swarm optimization (PSO), the bat algorithm (BA), and grey wolf optimization (GWO) for text document clustering with six data sets, and evaluated the performance of these algorithms using metrics such as purity, homogeneity, completeness, V-measure, adjusted Rand index (ARI), and average running time
The specific characteristics of each swarm intelligence (SI) algorithm make it suitable for solving specific optimization problems such as feature selection, finding an optimal route, job scheduling, role-based learning, and clustering
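
Of the metrics listed above, purity is the simplest to compute by hand. The sketch below assumes the common definition (each cluster is credited with its most frequent true class, and the credited counts are divided by the number of documents); the function name `purity` and the toy labels are illustrative and not taken from the paper.

```python
from collections import Counter


def purity(labels_true, labels_pred):
    """Fraction of documents that belong to the majority class of their cluster."""
    clusters = {}
    for true_label, cluster_id in zip(labels_true, labels_pred):
        clusters.setdefault(cluster_id, []).append(true_label)
    # For each cluster, count how many members share its most common true class.
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(labels_true)


# Toy usage: six documents, two true classes, two predicted clusters.
true_classes = ["sport", "sport", "tech", "tech", "tech", "sport"]
predicted = [0, 0, 1, 1, 0, 1]
print(purity(true_classes, predicted))
```

In this toy case each cluster holds two documents of its majority class and one of the other, so four of the six documents are credited.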

Summary

Introduction

Text document clustering is an application of cluster analysis: the unsupervised classification of textual documents into clusters based on content similarity. It can be applied to organizing large document collections, extracting hidden information from data generated by IoT sensors, finding similar documents, detecting duplicate content, search optimization, and recommendation systems [1,2]. The documents may come from web pages, blog posts, news articles, or other text files [3]. Extracting relevant information from such data is a challenging task that requires fast, high-quality document clustering algorithms.

Results

Discussion

Conclusion

Journal: Sensors (Basel, Switzerland)
Publication Date: May 4, 2021
Citations: 16
License type: CC BY 4.0
