Abstract

When ranking a list of documents for a given query, vocabulary mismatch between the language of queries and that of documents can compromise performance. Although BERT-based re-rankers have significantly advanced the state of the art, such mismatches persist. Moreover, recent work has shown that it is non-trivial to boost the performance of BERT-based re-rankers with established query expansion methods. This paper therefore proposes a novel query expansion model based on unsupervised chunk selection, coined BERT-QE. BERT-QE consists of three phases: after a first-round re-ranking in phase one, it leverages the strength of the BERT model to select relevant text chunks from feedback documents in phase two, and uses them for the final re-ranking in phase three. Furthermore, different variants of BERT-QE are thoroughly investigated for a better trade-off between effectiveness and efficiency, including the use of smaller BERT variants and of recently proposed late interaction methods. On the standard TREC Robust04 and GOV2 test collections, the proposed BERT-QE model significantly outperforms BERT-Large; its best variant does so on shallow metrics with less than 1% additional computation.
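The three phases summarized above can be sketched as a small re-ranking pipeline. This is a minimal illustration, not the paper's implementation: the function `rel(a, b)` is a hypothetical stand-in for a fine-tuned BERT relevance scorer, and the parameter names (`k` feedback documents, `m` selected chunks, interpolation weight `alpha`) are illustrative assumptions.

```python
import math

def bert_qe_rerank(query, docs, rel, k=3, m=5, chunk_len=5, alpha=0.4):
    """Sketch of a three-phase BERT-QE-style pipeline.

    `rel(a, b)` is a hypothetical relevance scorer standing in for a
    fine-tuned BERT model; all hyperparameter names are illustrative.
    """
    # Phase 1: first-round re-ranking of documents by rel(query, doc).
    ranked = sorted(docs, key=lambda d: rel(query, d), reverse=True)

    # Phase 2: slide a window over the top-k feedback documents to form
    # chunks, then keep the m chunks most relevant to the query.
    chunks = []
    for d in ranked[:k]:
        tokens = d.split()
        for i in range(max(1, len(tokens) - chunk_len + 1)):
            chunks.append(" ".join(tokens[i:i + chunk_len]))
    top_chunks = sorted(chunks, key=lambda c: rel(query, c), reverse=True)[:m]

    # Softmax-normalize the query-chunk scores into chunk weights.
    ws = [math.exp(rel(query, c)) for c in top_chunks]
    z = sum(ws)

    # Phase 3: final score interpolates the original query-document score
    # with the chunk-weighted chunk-document scores.
    def final(d):
        expansion = sum(w / z * rel(c, d) for w, c in zip(ws, top_chunks))
        return (1 - alpha) * rel(query, d) + alpha * expansion

    return sorted(ranked, key=final, reverse=True)
```

In this sketch the efficiency trade-off discussed in the abstract corresponds to how expensive `rel` is in phase two and three, which is where smaller BERT variants or late interaction methods would be substituted.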
