Harmony K-means algorithm for document clustering

Mehrdad Mahdavi, Hassan Abolhassani

doi:10.1007/s10618-008-0123-0

Abstract

Fast, high-quality document clustering is a crucial task in organizing information and search engine results, enhancing web crawling, and information retrieval or filtering. Recent studies have shown that the most commonly used partition-based clustering algorithm, K-means, is more suitable for large datasets. However, K-means can converge to a locally optimal solution. In this paper we propose a novel Harmony K-means Algorithm (HKA) that performs document clustering based on the Harmony Search (HS) optimization method. It is proved by means of finite Markov chain theory that HKA converges to the global optimum. To demonstrate the effectiveness and speed of HKA, we have applied it to some standard datasets. We also compare HKA with other meta-heuristic and model-based document clustering approaches. Experimental results reveal that HKA converges to the best known optimum faster than other methods and that the quality of its clusters is comparable.
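
The abstract does not spell out the steps of HKA, so the following is only a minimal sketch of the general idea it names: a Harmony Search over cluster assignments, with one K-means reassignment step used to refine each new candidate. It is not the authors' algorithm. The function names, the parameters `hms` (harmony memory size), `hmcr` (memory consideration rate), and `par` (pitch adjustment rate), the use of squared Euclidean distance, and the treatment of pitch adjustment as a random relabel are all assumptions made for illustration.

```python
import random


def dist2(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroids(points, labels, k):
    """Mean vector of each cluster; None for a cluster with no members."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dim):
            sums[c][d] += p[d]
    return [[s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(k)]


def cost(points, labels, k):
    """Sum of squared distances from each point to its own cluster centroid."""
    cents = centroids(points, labels, k)
    return sum(dist2(p, cents[c]) for p, c in zip(points, labels))


def kmeans_step(points, labels, k):
    """One K-means reassignment: move each point to its nearest non-empty centroid."""
    cents = centroids(points, labels, k)
    live = [c for c in range(k) if cents[c] is not None]
    return [min(live, key=lambda c: dist2(p, cents[c])) for p in points]


def harmony_kmeans(points, k, hms=8, hmcr=0.9, par=0.1, iters=200, seed=0):
    """Hypothetical hybrid: harmony search proposes labelings, K-means refines them."""
    rng = random.Random(seed)
    n = len(points)
    # Harmony memory: hms random labelings, each scored by clustering cost.
    memory = [[rng.randrange(k) for _ in range(n)] for _ in range(hms)]
    costs = [cost(points, h, k) for h in memory]
    for _ in range(iters):
        new = []
        for i in range(n):
            if rng.random() < hmcr:
                # Memory consideration: copy this point's label from a stored harmony.
                label = memory[rng.randrange(hms)][i]
                if rng.random() < par:
                    # Pitch adjustment, taken here to be a random relabel (assumption).
                    label = rng.randrange(k)
            else:
                # Random selection from the full label range.
                label = rng.randrange(k)
            new.append(label)
        new = kmeans_step(points, new, k)  # local refinement
        c = cost(points, new, k)
        worst = max(range(hms), key=costs.__getitem__)
        if c < costs[worst]:
            # Replace the worst stored harmony with the improved candidate.
            memory[worst], costs[worst] = new, c
    best = min(range(hms), key=costs.__getitem__)
    return memory[best], costs[best]
```

Calling `harmony_kmeans(points, k=2)` on a list of numeric tuples returns a label per point and the corresponding cost. For real documents the vectors would typically be term-weight vectors and the distance a cosine-based measure; the sketch uses small Euclidean vectors only to stay self-contained.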

Journal: Data Mining and Knowledge Discovery
Publication Date: Dec 11, 2008
Citations: 141

Similar Papers

Efficient stochastic algorithms for document clustering
Rana Forsati ... Mohammad Reza Meybodi
Information Sciences | VOL. 220
30 Jul 2012

Hybridization of K-Means and Harmony Search Methods for Web Page Clustering
Rana Forsati ... Mohammadreza Meybodi
01 Dec 2008

An improved clustering algorithm based on K-means and harmony search optimization
Lekshmy P Chandran ... K A Abdul Nazeer
01 Sep 2011

Optimization of seismic isolation systems via harmony search
Sinan Melih Nigdeli ... Cenk Alhan
Engineering Optimization | VOL. 46
27 Nov 2013
