Abstract
The information explosion on the Internet has placed high demands on search engines. Despite improvements in search engine technology, the precision of current search engines remains unsatisfactory. Moreover, the queries submitted by users are often short, ambiguous and imprecise, which makes similar queries hard to recognize: they may share no keywords, the search engine may return different documents for them, and they may have no clicked documents in common. These problems render traditional query clustering methods unsuitable for query recommendation. In this paper, we propose a new query recommendation system. It identifies conceptually related queries by capturing users’ preferences from the click-through graphs of web search logs and by extracting the features most relevant to each query from the result snippets. The proposed system has an online feature extraction phase and an offline phase in which feature filtering and query clustering are performed. Query clustering is carried out by a new tripartite agglomerative clustering algorithm, Query-Document-Concept Clustering, in which the documents are used in a novel way to decouple queries from features/concepts in a tripartite graph structure. This yields clusters of similar queries together with associated clusters of documents and clusters of features. We model the query recommendation problem in four ways: two content-ignorant models (non-personalized and personalized) and two content-aware models (non-personalized and personalized). Three similarity measures are introduced to estimate different kinds of similarity. Experimental results show that the proposed approach achieves better precision, recall and F-measure than existing approaches.
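To make the tripartite idea concrete, the following is a minimal sketch and not the paper's Query-Document-Concept Clustering algorithm. The abstract does not define the three similarity measures, so this sketch assumes a Jaccard overlap on clicked documents and on concepts reached through those documents, mixed by a hypothetical weight `alpha`. The function names, the `threshold` parameter, and the average-linkage merge rule are likewise illustrative assumptions. It clusters queries only, whereas the abstract describes clusters of queries, documents and features.

```python
from collections import defaultdict


def build_tripartite(clicks, doc_concepts):
    """Build query->documents and query->concepts maps.

    clicks: iterable of (query, document) pairs from a search log.
    doc_concepts: dict mapping a document to the set of concepts
        (features) extracted from its snippet.
    """
    q2d = defaultdict(set)
    for query, doc in clicks:
        q2d[query].add(doc)
    # Queries reach concepts only through the documents they clicked,
    # so documents sit between queries and concepts.
    q2c = {
        q: set().union(*(doc_concepts.get(d, set()) for d in docs))
        for q, docs in q2d.items()
    }
    return dict(q2d), q2c


def jaccard(a, b):
    """Jaccard overlap of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def query_similarity(q1, q2, q2d, q2c, alpha=0.5):
    """Assumed measure: weighted mix of click overlap and concept overlap."""
    return (alpha * jaccard(q2d[q1], q2d[q2])
            + (1 - alpha) * jaccard(q2c[q1], q2c[q2]))


def cluster_queries(queries, sim, threshold):
    """Greedy agglomerative clustering with average linkage.

    Repeatedly merges the most similar pair of clusters until no pair
    reaches `threshold`.
    """
    clusters = [{q} for q in queries]
    while True:
        best, pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(sim(a, b) for a in clusters[i] for b in clusters[j])
                score = total / (len(clusters[i]) * len(clusters[j]))
                if score >= best:
                    best, pair = score, (i, j)
        if pair is None:
            return clusters
        i, j = pair
        # j > i, so removing cluster j does not shift cluster i.
        clusters[i] |= clusters.pop(j)


# Made-up log: the first two queries share no keywords and no clicks,
# but their clicked documents share concepts.
clicks = [("jaguar speed", "d1"), ("fastest big cat", "d2"), ("jaguar price", "d3")]
doc_concepts = {
    "d1": {"animal", "speed"},
    "d2": {"animal", "speed", "cheetah"},
    "d3": {"car", "price"},
}
q2d, q2c = build_tripartite(clicks, doc_concepts)
clusters = cluster_queries(
    list(q2d), lambda a, b: query_similarity(a, b, q2d, q2c), threshold=0.3
)
```

In this toy log, "jaguar speed" and "fastest big cat" end up in one cluster through their shared concepts, while "jaguar price" stays apart even though it shares a keyword with the first query.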