Abstract

To understand the intents behind user queries, many researchers have studied the problem of finding similar queries. Recently, the click graph has proven useful for describing the relationship between queries and URLs. Previous approaches mainly either generate related terms or find relevant queries based on co-clicked URLs. However, these approaches may suffer from the complexity of natural language processing and from the sparseness of click-through data. In this paper, we tackle this problem with three models that each represent a query as a probability distribution: the Click Model, the Term Model, and the Semantic Model. The Click Model extracts reliable transition probabilities from queries to URLs and describes a query without considering web content. The Term Model represents a query as a term distribution over its main entities and purposes, which better captures the information needs behind short and ambiguous keyword queries. The Semantic Model learns a distribution over the potential intents of a query in order to distinguish the user intents behind it. On top of these three models, we apply pairwise similarity metrics and graph-based personalized PageRank to find similar queries. Compared to traditional representation models, our representation models are shown to be effective and efficient, especially for long-tail queries.
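
As a rough illustration of the Click Model idea and the two retrieval steps mentioned above, the sketch below row-normalizes click counts into query-to-URL transition probabilities, compares queries with cosine similarity, and runs a personalized PageRank (random walk with restart) over the bipartite click graph. It is not the paper's implementation: the toy click log, the function names, the choice of cosine as the pairwise metric, the backward URL-to-query weights, and the restart probability `alpha` are all assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy click log: (query, url, click count).
CLICKS = [
    ("cheap flights", "travel-a.example", 8),
    ("cheap flights", "travel-b.example", 2),
    ("low cost airfare", "travel-a.example", 3),
    ("low cost airfare", "travel-c.example", 1),
    ("python tutorial", "docs.example", 5),
]


def transition_probs(clicks):
    """Row-normalize click counts into P(url | query) for each query."""
    counts = defaultdict(lambda: defaultdict(int))
    for query, url, n in clicks:
        counts[query][url] += n
    return {
        q: {u: n / sum(urls.values()) for u, n in urls.items()}
        for q, urls in counts.items()
    }


def cosine(p, q):
    """Cosine similarity between two sparse distributions stored as dicts."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def personalized_pagerank(probs, seed, alpha=0.15, iters=50):
    """Random walk with restart on the bipartite query-URL graph.

    Assumption: the URL -> query step reuses P(url | query), renormalized
    per URL. The restart always returns to the seed query.
    """
    raw_back = defaultdict(dict)
    for q, urls in probs.items():
        for u, p in urls.items():
            raw_back[u][q] = p
    back = {
        u: {q: p / sum(qs.values()) for q, p in qs.items()}
        for u, qs in raw_back.items()
    }

    score = {q: 1.0 if q == seed else 0.0 for q in probs}
    for _ in range(iters):
        # Step 1: push query mass to URLs.
        url_mass = defaultdict(float)
        for q, s in score.items():
            for u, p in probs[q].items():
                url_mass[u] += s * p
        # Step 2: push URL mass back to queries.
        walked = {q: 0.0 for q in probs}
        for u, m in url_mass.items():
            for q, p in back[u].items():
                walked[q] += m * p
        # Step 3: mix with the restart distribution on the seed.
        score = {
            q: (alpha if q == seed else 0.0) + (1 - alpha) * walked[q]
            for q in probs
        }
    return score


probs = transition_probs(CLICKS)
sim = cosine(probs["cheap flights"], probs["low cost airfare"])
ranks = personalized_pagerank(probs, seed="cheap flights")
```

In this toy graph, "low cost airfare" shares a clicked URL with the seed query and therefore receives a nonzero score, while "python tutorial" is disconnected from the seed and receives none. The walk can surface related queries that are several hops away in the click graph, which is one plausible reason a graph-based method helps when direct co-click evidence is sparse.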
