Abstract

The text clustering based on Vector Space Model has problems, such as high-dimensional and sparse, unable to solve synonym and polyseme etc. And meanwhile, k-means clustering algorithm has shortcomings, which depends on the initial clustering center and needs to fix the number of clusters in advance. Aiming at these problems, in this paper, a text clustering algorithm based on Latent Semantic Analysis and Optimization is proposed. This algorithm can not only overcome the problems of Vector Space Model, but also can avoid the shortcomings of k-means algorithm. And compared with the text clustering algorithm based on Latent Semantic Analysis and the text clustering algorithm based on Vector Space Model and optimization, our algorithm is proved which can preferably improve the effect of text clustering, and upgrade the precision ratio and recall ration of text.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call