Abstract

Clustering is a widely used method of information acquisition. Traditional text clustering cannot fully exploit the semantic information in text resources, and its similarity matrix is high-dimensional and sparse. To address these problems and further improve the quality of text clustering, this paper proposes a text clustering method based on semantic similarity in the Semantic Web. The method first calculates the semantic similarity of words to obtain a text semantic similarity matrix, and then carries out spectral clustering on that matrix (SS-SC). The proposed method takes the semantic relations between words into account, fully mines the latent information of the subject text, improves clustering quality, and provides a new method for text clustering and recommendation. This paper also verifies the effect of the improved weight calculation method on clustering effectiveness. Using text resources from the Google text corpus as the data source, the traditional K-Means clustering algorithm, the TCUSS (Text ClUstering based on Semantic Similarity) algorithm, and the SS-SC algorithm are each tested. The results show that the precision of the proposed method is higher than that of the traditional clustering algorithm.
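
The step from word-level similarity to a text semantic similarity matrix can be illustrated with a minimal sketch. The abstract does not state which word similarity measure or aggregation rule is used, so everything here is an assumption made for illustration: the toy `WORD_SIM` lookup table, the helper names, and the "average of best matches" aggregation are hypothetical and are not taken from the paper.

```python
# Hypothetical word-level similarity scores in [0, 1]. In practice these would
# come from a semantic resource; the values below are invented for the example.
WORD_SIM = {
    frozenset(("car", "automobile")): 0.9,
    frozenset(("car", "engine")): 0.6,
    frozenset(("automobile", "engine")): 0.6,
    frozenset(("apple", "fruit")): 0.8,
    frozenset(("banana", "fruit")): 0.8,
    frozenset(("apple", "banana")): 0.7,
}


def word_similarity(a, b):
    """Return the similarity of two words; unknown pairs score 0.0."""
    if a == b:
        return 1.0
    return WORD_SIM.get(frozenset((a, b)), 0.0)


def directed_similarity(doc_a, doc_b):
    """Average, over the words of doc_a, of each word's best match in doc_b."""
    if not doc_a or not doc_b:
        return 0.0
    best = [max(word_similarity(w, v) for v in doc_b) for w in doc_a]
    return sum(best) / len(best)


def text_similarity(doc_a, doc_b):
    """Symmetric text similarity: mean of the two directed scores (an assumption)."""
    return 0.5 * (directed_similarity(doc_a, doc_b) + directed_similarity(doc_b, doc_a))


def similarity_matrix(docs):
    """Build the full n x n text semantic similarity matrix as nested lists."""
    return [[text_similarity(a, b) for b in docs] for a in docs]


# Four tiny tokenized "documents": two about vehicles, two about fruit.
docs = [
    ["car", "engine"],
    ["automobile", "engine"],
    ["apple", "fruit"],
    ["banana", "fruit"],
]
S = similarity_matrix(docs)
```

In this toy example the two vehicle documents and the two fruit documents score highly with each other and zero across groups, even though they share few exact words. A matrix like `S` could then be passed to a spectral clustering routine that accepts a precomputed affinity matrix, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`; that is one possible choice, not necessarily the implementation used in the paper.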

