Study of Text Clustering in Semantic Web

Liuyang Wang, Yangxin Yu

doi:10.1109/iske47853.2019.9170450

Abstract

Clustering is a widely used information acquisition method. Traditional text clustering cannot fully exploit the semantic information of text resources, and its similarity matrix is high-dimensional and sparse. To address these problems and further improve the quality of text clustering, this paper proposes a text clustering method based on semantic similarity in the Semantic Web. The method calculates the semantic similarity of words to obtain a text semantic similarity matrix, and then performs spectral clustering on that matrix (SS-SC). It takes into account the semantic relations between words, fully mines the potential information of the subject text, improves the quality of the clustering, and provides a new method for text clustering and recommendation. This paper also verifies the effect of the improved weight calculation method on clustering effectiveness. Using the text resources of the Google text corpus as the data source, the traditional K-Means clustering algorithm, the TCUSS (Text ClUstering based on Semantic Similarity) algorithm, and the SS-SC algorithm are each tested. The results show that the precision of the proposed method is higher than that of the traditional clustering algorithm.
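
The pipeline outlined in the abstract (word-level semantic similarity, then a text semantic similarity matrix, then spectral clustering on that matrix) can be illustrated with a minimal sketch. This is not the authors' implementation. The word similarity table `WORD_SIM`, the best-match averaging in `text_sim`, the toy documents in `DOCS`, and the restriction to a two-way split are all assumptions made for illustration; the abstract does not specify the similarity measure or the weighting scheme.

```python
import math

# Hypothetical word-to-word semantic similarities in [0, 1]. In practice these
# would come from an ontology or lexical resource; the values are made up.
WORD_SIM = {
    frozenset(("car", "vehicle")): 0.9,
    frozenset(("car", "engine")): 0.7,
    frozenset(("vehicle", "engine")): 0.6,
    frozenset(("fruit", "apple")): 0.9,
    frozenset(("fruit", "banana")): 0.8,
    frozenset(("apple", "banana")): 0.7,
}


def word_sim(a, b):
    """Similarity of two words; unrelated pairs get a small default value."""
    if a == b:
        return 1.0
    return WORD_SIM.get(frozenset((a, b)), 0.05)


def text_sim(doc_a, doc_b):
    """Assumed aggregation: symmetric average of best-match word similarities."""
    def directed(src, dst):
        return sum(max(word_sim(w, v) for v in dst) for w in src) / len(src)
    return 0.5 * (directed(doc_a, doc_b) + directed(doc_b, doc_a))


def similarity_matrix(docs):
    """Dense text semantic similarity matrix W, with W[i][j] in [0, 1]."""
    n = len(docs)
    return [[text_sim(docs[i], docs[j]) for j in range(n)] for i in range(n)]


def spectral_bipartition(W, iters=200):
    """Two-way spectral split using the second eigenvector of D^-1/2 W D^-1/2."""
    n = len(W)
    deg = [sum(row) for row in W]
    # Shifted normalized affinity (I + D^-1/2 W D^-1/2) / 2: eigenvalues lie in
    # [0, 1], so power iteration cannot lock onto a negative eigenvalue.
    M = [[0.5 * ((1.0 if i == j else 0.0) + W[i][j] / math.sqrt(deg[i] * deg[j]))
          for j in range(n)] for i in range(n)]
    # The leading eigenvector is proportional to sqrt(deg); it is deflated away.
    total = math.sqrt(sum(deg))
    top = [math.sqrt(d) / total for d in deg]
    v = [math.sin(i + 1) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        v = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # The sign of each entry assigns the document to one of two clusters.
    return [0 if x >= 0 else 1 for x in v]


# Toy documents, already tokenized (hypothetical data).
DOCS = [
    ["car", "engine"],
    ["vehicle", "engine"],
    ["apple", "fruit"],
    ["banana", "fruit"],
]

W = similarity_matrix(DOCS)
labels = spectral_bipartition(W)
print(labels)
```

On this toy input the two vehicle-related documents fall into one cluster and the two fruit-related documents into the other, even though "car" and "vehicle" share no surface form. Splitting into more than two clusters would need a general spectral clustering routine that embeds the documents with several eigenvectors and then groups them, for example with K-Means.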

Full Text