Large-scale Search Logs Research Articles

Result ranking is one of the major concerns for Web search technologies. Most existing methodologies rank search results in descending order of relevance. To model the interactions among search results, reinforcement learning (RL) algorithms have been widely adopted for ranking tasks. However, the online training of RL methods is time- and resource-consuming at scale. As an alternative, learning ranking policies in a simulation environment is much more feasible and efficient. In this article, we propose two different simulation environments for the offline training of the RL ranking agent: the Context-aware Click Simulator (CCS) and the Fine-grained User Behavior Simulator with GAN (UserGAN). Based on the simulation environment, we also design a User Behavior Simulation for Reinforcement Learning (UBS4RL) re-ranking framework, which consists of three modules: a feature extractor for heterogeneous search results, a user simulator for collecting simulated user feedback, and a ranking agent for generating optimized result lists. Extensive experiments on both simulated and practical Web search datasets show that (1) the proposed user simulators can capture and simulate fine-grained user behavior patterns by training on large-scale search logs, (2) the temporal information of the user's searching process is a strong signal for ranking evaluation, and (3) learning ranking policies from the simulation environment can effectively improve search ranking performance.

Query suggestion plays an important role in improving the usability of search engines. Although some recently proposed methods provide query suggestions by mining query patterns from search logs, none of them systematically models the immediately preceding queries as context or uses context information effectively in query suggestion. Context-aware query suggestion is challenging in both modeling context and scaling up query suggestion using context. In this article, we propose a novel context-aware query suggestion approach. To tackle the challenges, our approach consists of two stages. In the first, offline model-learning stage, to address data sparseness, queries are summarized into concepts by clustering a click-through bipartite. A concept sequence suffix tree is then constructed from session data as a context-aware query suggestion model. In the second, online query suggestion stage, a user’s search context is captured by mapping the query sequence submitted by the user to a sequence of concepts. By looking up the context in the concept sequence suffix tree, we suggest context-aware queries to the user. We test our approach on large-scale search logs of a commercial search engine containing 4.0 billion Web queries, 5.9 billion clicks, and 1.87 billion search sessions. The experimental results clearly show that our approach outperforms three baseline methods in both coverage and quality of suggestions.
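
The two stages described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the query-to-concept mapping is a hand-written toy table (the article derives concepts by clustering a click-through bipartite), the suffix tree is approximated by a dictionary keyed on concept-sequence suffixes, and the names `QUERY_TO_CONCEPT`, `build_model`, `suggest`, and `max_context` are all hypothetical. The sketch also returns the most likely next concept, whereas a full system would go on to pick representative queries from that concept.

```python
from collections import Counter, defaultdict

# Hypothetical toy mapping from queries to concepts. The article obtains
# concepts by clustering a click-through bipartite; that step is not shown.
QUERY_TO_CONCEPT = {
    "msn": "portal",
    "msn news": "news",
    "cnn news": "news",
    "bbc news": "news",
    "weather": "weather",
}


def to_concepts(queries):
    """Map a query sequence to a concept sequence, skipping unknown queries."""
    return [QUERY_TO_CONCEPT[q] for q in queries if q in QUERY_TO_CONCEPT]


def build_model(sessions, max_context=2):
    """Offline stage: count which concept follows each context.

    A context is a suffix (up to max_context long) of the concepts that
    precede a position in a session. A dict of suffix tuples stands in
    for the concept sequence suffix tree.
    """
    model = defaultdict(Counter)
    for session in sessions:
        concepts = to_concepts(session)
        for i in range(1, len(concepts)):
            for k in range(1, min(max_context, i) + 1):
                context = tuple(concepts[i - k:i])
                model[context][concepts[i]] += 1
    return model


def suggest(model, recent_queries, max_context=2):
    """Online stage: look up the longest matching context, then back off."""
    concepts = to_concepts(recent_queries)
    for k in range(min(max_context, len(concepts)), 0, -1):
        context = tuple(concepts[-k:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # no known context matches


sessions = [
    ["msn", "msn news", "weather"],
    ["msn", "cnn news", "weather"],
    ["msn", "bbc news"],
]
model = build_model(sessions)
print(suggest(model, ["msn"]))
print(suggest(model, ["msn", "bbc news"]))
```

Grouping queries into concepts before counting is what addresses sparseness in this sketch: "msn news", "cnn news", and "bbc news" each appear only once, yet together they give the "news" concept enough evidence to support a suggestion.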

Articles published on Large-scale Search Logs

User Behavior Simulation for Search Result Re-ranking

Enhancing web search with queries of equivalent intents

Mining Concept Sequences from Large-Scale Search Logs for Context-Aware Query Suggestion
