Weighted k-Means Algorithm Based Text Clustering

Xiuguo Chen,Pinghui Tu,Hengxi Zhang,Wensheng Yin

doi:10.1109/ieec.2009.17

Publication Date: May 1, 2009

Citations: 19

Affiliation: Huazhong University of Science and Technology, Wuhan Science and Technology Bureau

#Weighted K-means Algorithm #K-means Algorithm

Abstract

This paper proposes a weighted k-means clustering algorithm, based on the k-means algorithm (MacQueen, 1967; Anderberg, 1973), that can be used to cluster texts. First, the weighted k-means algorithm changes how text objects are described, converting categorical attributes to numeric ones so that the dissimilarity of text objects can be measured by Euclidean distance. Second, it uses a weight vector to decrease the effects of irrelevant attributes and to reflect the semantic information of text objects. An experiment demonstrates that the weighted k-means algorithm is more effective than the k-means algorithm when used to cluster texts.
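The idea of a weight vector inside the distance computation can be illustrated with a minimal sketch. The abstract does not say how the weights are derived or how attributes are encoded, so the code below makes several assumptions: documents are already numeric vectors, the weight vector is fixed and supplied by the caller, and the distance is the weighted Euclidean form sqrt(sum_j w_j (x_j - y_j)^2). The function names `weighted_distance` and `weighted_kmeans` and the toy data are hypothetical, not taken from the paper.

```python
import math
import random


def weighted_distance(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_j w_j * (x_j - y_j)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def weighted_kmeans(points, k, weights, max_iter=100, seed=0):
    """Lloyd-style k-means using a fixed per-attribute weight vector.

    Assumption: the weights are given up front; the paper's own way of
    choosing them is not described in the abstract.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the weighted distance.
        new_labels = [
            min(range(k), key=lambda c: weighted_distance(p, centroids[c], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy documents as numeric vectors; the third attribute is irrelevant noise.
docs = [
    [1.0, 0.0, 9.0],
    [0.9, 0.1, 1.0],
    [0.0, 1.0, 8.5],
    [0.1, 0.9, 0.5],
]
weights = [1.0, 1.0, 0.0]  # hypothetical weights that ignore the noisy attribute
labels, centroids = weighted_kmeans(docs, k=2, weights=weights)
```

With a zero weight on the third attribute, the first two documents fall into one cluster and the last two into the other. Setting all weights to 1.0 recovers plain k-means, where the noisy attribute dominates the distance and can pull the grouping toward it instead.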

Full Text