Abstract

Query reformulation techniques are essential for information retrieval systems. These techniques eliminate bad queries (short and ambiguous queries) and minimize users' search time. Two kinds of techniques appear in the literature. The first uses query logs with click-through data to identify relevant queries and propose them to the user. The second uses relevant terms extracted from sources such as Wikipedia, WordNet, and pseudo-relevant documents to expand the initial query and increase the likelihood of matching relevant documents. Both groups generate false queries because they do not consider the user's interests and because they rest on fragile assumptions: that the user clicks only on relevant results (first group), and that the top-k retrieved documents (the top 5 to 15) are relevant (second group). In this paper, we propose a novel approach that reformulates user queries using the user's profile, which contains the user's interests. The approach retrieves documents related to the query from four different data sources (≈ 1000 documents). Then, it extracts the topics of these documents using the Lingo clustering algorithm (to group documents that share a topic) and the TextRazor API (to extract the potential topics of each cluster). Finally, it generates queries based on these topics and the user profile. The results show that the proposed approach outperforms the existing solution in P@10, P@20, and MAP.
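
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how P@k and MAP are conventionally computed from a ranked result list and a set of relevance judgments. It is not the evaluation code used in the paper; the function names, the document identifiers, and the choice to score a query with no relevant documents as 0 are assumptions made for illustration.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are relevant (P@k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in ranked[:k] if doc_id in relevant)
    return hits / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Sum of P@i at each rank i holding a relevant document, divided by |relevant|."""
    if not relevant:
        return 0.0  # assumption: a query without judged-relevant documents scores 0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Dict[str, List[str]], qrels: Dict[str, Set[str]]
) -> float:
    """Mean of the per-query average precision over all queries in `runs`."""
    if not runs:
        return 0.0
    scores = [average_precision(r, qrels.get(q, set())) for q, r in runs.items()]
    return sum(scores) / len(scores)


# Hypothetical toy data: two queries with hand-made rankings and judgments.
runs = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["a", "b"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"b"}}
p_at_2 = precision_at_k(runs["q1"], qrels["q1"], 2)
map_score = mean_average_precision(runs, qrels)
```

With P@10 and P@20, `k` would be set to 10 and 20 respectively; MAP additionally rewards placing relevant documents early in the ranking, which is why the two kinds of metric are usually reported together.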

