Query Expansion Using Semantic Pruning in Language Model for Information Retrieval

Wei Tu, Lixin Gan, Zhihua Xie

doi:10.1007/978-3-642-33506-8_82

Abstract

A new approach to query expansion using semantic pruning in a language model is presented. Traditional query expansion methods usually assume that the terms within a query are independent. These methods often select expansion terms whose thematic similarity to the original query terms exceeds a specified threshold, thus generating a disjunctive query with much higher dimensionality. This poses two major problems: 1) potential topic dilution caused by overly aggressive or incorrect expansion, and 2) the drastically increased execution cost of a high-dimensional query. The method developed in this paper addresses both problems by extracting the relationships between the terms within a query and mutually pruning the candidate expansion terms for those query terms. Experiments were conducted on several collections, including ADI, CISI, CRAN, and CACM. The results show that our approach yields significant improvements.
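
The contrast between independent threshold-based expansion and mutual pruning can be illustrated with a small sketch. The abstract does not specify the pruning rule, so the code below assumes one possible reading: a candidate expansion term is kept only if it is a candidate for at least `min_support` query terms. The similarity table `SIM`, the vocabulary `VOCAB`, and the functions `expand_independent` and `expand_pruned` are hypothetical names with made-up values, not the authors' method or data.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical term-to-term similarity scores (illustrative values only).
SIM: Dict[Tuple[str, str], float] = {
    ("java", "coffee"): 0.8,
    ("java", "programming"): 0.9,
    ("java", "compiler"): 0.7,
    ("language", "programming"): 0.8,
    ("language", "compiler"): 0.6,
    ("language", "grammar"): 0.7,
}

# Hypothetical vocabulary of possible expansion terms.
VOCAB: List[str] = ["coffee", "programming", "compiler", "grammar"]


def sim(query_term: str, candidate: str) -> float:
    """Look up the similarity of a (query term, candidate) pair; 0.0 if unknown."""
    return SIM.get((query_term, candidate), 0.0)


def candidates(query_term: str, vocab: List[str], threshold: float) -> Set[str]:
    """Candidates whose similarity to one query term meets the threshold."""
    return {v for v in vocab if sim(query_term, v) >= threshold}


def expand_independent(query: List[str], vocab: List[str], threshold: float) -> Set[str]:
    """Traditional expansion: treat query terms independently and take the union."""
    expanded: Set[str] = set()
    for q in query:
        expanded |= candidates(q, vocab, threshold)
    return expanded


def expand_pruned(query: List[str], vocab: List[str], threshold: float,
                  min_support: int = 2) -> Set[str]:
    """Assumed pruning rule: keep a candidate only if at least
    `min_support` query terms propose it."""
    support: Dict[str, int] = {}
    for q in query:
        for c in candidates(q, vocab, threshold):
            support[c] = support.get(c, 0) + 1
    return {c for c, n in support.items() if n >= min_support}


query = ["java", "language"]
independent_terms = expand_independent(query, VOCAB, 0.6)
pruned_terms = expand_pruned(query, VOCAB, 0.6)
```

In this toy setting, independent expansion admits every candidate that clears the threshold for any single query term, including "coffee", which fits "java" in isolation but not the query as a whole. The assumed pruning rule keeps only the terms supported by both query terms, which yields a shorter and more focused expanded query.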
