Abstract

In this digital era, millions of Internet users contribute vast amounts of data in the form of unstructured text documents. Organizing this material is a tedious task, and text document clustering plays a vital role in it. In this paper, we apply a Hybrid Jaya Optimization (HJO) algorithm to text document clustering (DC); we refer to the resulting method as HJO-DC. We use the Silhouette index to measure the quality of a solution. We compare the proposed method with the partitioning techniques K-Means and K-Medoids and with the metaheuristic techniques Genetic Algorithm, Cuckoo Search, Particle Swarm Optimizer, Firefly, and Grey Wolf Optimizer. The proposed algorithm achieves the highest-quality clustering on all benchmark examples.
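
The Silhouette index used as the quality metric can be illustrated with a minimal sketch. For each point, a is the mean distance to the other members of its own cluster and b is the smallest mean distance to any other cluster; the point's score is (b - a) / max(a, b), and the index is the mean over all points. The function names, the Euclidean distance, and the toy 2-D points below are assumptions made for illustration only; the abstract does not state which distance or document representation HJO-DC uses.

```python
import math


def euclidean(a, b):
    # Assumed distance for this sketch; text clustering often uses cosine distance instead.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_index(points, labels, dist=euclidean):
    """Mean silhouette score over all points (hypothetical helper, not the paper's code)."""
    # Group point indices by cluster label.
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_index(points, [0, 0, 1, 1])  # natural grouping
bad = silhouette_index(points, [0, 1, 0, 1])   # grouping that mixes the pairs
```

In this toy case the natural grouping scores close to 1 and the mixed grouping scores below 0, which is why an optimizer can use the index as a fitness value when searching over candidate clusterings.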
