Abstract

Most existing query expansion approaches for ad-hoc retrieval adopt overly simplistic textual representations that treat documents as bags of words and ignore inherent document structure. These representations often lead to incorrect independence assumptions and limit retrieval effectiveness. In this paper, we propose a novel query expansion technique that models the various types of dependencies between original query terms and expansion terms within a robust, unified framework. The proposed model, Hierarchical Markov Random Fields (HMRFs), builds on Latent Concept Expansion (LCE). By exploiting implicit or explicit hierarchical structure within documents, HMRFs can efficiently incorporate the hierarchical interactions that are important for modeling term dependencies. Our experimental evaluation on several TREC data sets shows that the proposed technique consistently and significantly outperforms current state-of-the-art query expansion approaches, including relevance-based language models and LCE.
