Abstract
Clustering web search results helps users find topics of interest by grouping related results. This paper presents an improved method for clustering search results, focused on Chinese web pages. The main contributions are as follows. First, a method is presented that identifies phrases carrying complete semantic information by comparing the attributes of base clusters in the suffix tree document model and the overlap of their document sets. Second, by analyzing the content and structure of the titles and snippets of Chinese web search results, a sentence segmentation scheme is designed and implemented for constructing the suffix tree. Third, to better reflect the degree of association between terms, a novel method is proposed that computes the distance between co-occurring terms at sentence granularity. Finally, experiments show that the new clustering method gives users an efficient and effective way to browse results and locate the information they seek.
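To make the idea of comparing the document sets of base clusters concrete, the following is a minimal sketch of the overlap test commonly used in classic suffix tree clustering, where two base clusters are linked when each shares more than half of its documents with the other. It is not the improved method proposed in this paper. The function names, the 0.5 threshold default, and the sample phrases and document IDs are illustrative assumptions.

```python
from itertools import combinations


def similar(docs_a, docs_b, threshold=0.5):
    """Return True if the two document sets overlap strongly in both directions."""
    if not docs_a or not docs_b:
        return False  # assumption: empty base clusters never match
    shared = len(docs_a & docs_b)
    return shared / len(docs_a) > threshold and shared / len(docs_b) > threshold


def merge_base_clusters(base_clusters, threshold=0.5):
    """Group base clusters (phrase -> set of document IDs) by mutual overlap.

    Returns a list of (sorted phrases, union of document IDs) pairs, one per
    connected component of the similarity graph.
    """
    phrases = list(base_clusters)
    parent = {p: p for p in phrases}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Link every pair of base clusters that passes the overlap test.
    for a, b in combinations(phrases, 2):
        if similar(base_clusters[a], base_clusters[b], threshold):
            parent[find(a)] = find(b)

    groups = {}
    for p in phrases:
        groups.setdefault(find(p), []).append(p)

    return [
        (sorted(group), set().union(*(base_clusters[p] for p in group)))
        for group in groups.values()
    ]


# Hypothetical base clusters: phrase -> IDs of the results containing it.
example = {
    "search engine": {1, 2, 3},
    "engine": {1, 2, 3, 4},
    "suffix tree": {5, 6},
}
merged = merge_base_clusters(example)
```

In this example "search engine" and "engine" share three documents, which exceeds half of each set, so they fall into one cluster, while "suffix tree" stays separate. A phrase such as "engine" that always co-occurs with a longer phrase is the kind of incomplete fragment that an overlap comparison can help detect.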