Abstract

When ranking a list of documents for a given query, vocabulary mismatch between the language of queries and that of documents can compromise performance. Although BERT-based re-rankers have significantly advanced the state of the art, such mismatches persist. Moreover, recent work has shown that it is non-trivial to boost the performance of BERT-based re-rankers with established query expansion methods. This paper therefore proposes a novel query expansion model based on unsupervised chunk selection, coined BERT-QE. BERT-QE consists of three phases: after a first-round re-ranking in phase one, it leverages the strength of the BERT model to select relevant text chunks from feedback documents in phase two, and uses them for the final re-ranking in phase three. Furthermore, different variants of BERT-QE are thoroughly investigated for a better trade-off between effectiveness and efficiency, including the use of smaller BERT variants and of recently proposed late interaction methods. On the standard TREC Robust04 and GOV2 test collections, the proposed BERT-QE model significantly outperforms BERT-Large; its best variant does so on shallow metrics with less than 1% additional computation.
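The three phases summarized above can be sketched as a small re-ranking pipeline. This is a minimal illustration, not the paper's implementation: the function `rel(a, b)` is a hypothetical stand-in for a fine-tuned BERT relevance scorer, and the parameter names (`k` feedback documents, `m` selected chunks, interpolation weight `alpha`) are illustrative assumptions.

```python
import math

def bert_qe_rerank(query, docs, rel, k=3, m=5, chunk_len=5, alpha=0.4):
    """Sketch of a three-phase BERT-QE-style pipeline.

    `rel(a, b)` is a hypothetical relevance scorer standing in for a
    fine-tuned BERT model; all hyperparameter names are illustrative.
    """
    # Phase 1: first-round re-ranking of documents by rel(query, doc).
    ranked = sorted(docs, key=lambda d: rel(query, d), reverse=True)

    # Phase 2: slide a window over the top-k feedback documents to form
    # chunks, then keep the m chunks most relevant to the query.
    chunks = []
    for d in ranked[:k]:
        tokens = d.split()
        for i in range(max(1, len(tokens) - chunk_len + 1)):
            chunks.append(" ".join(tokens[i:i + chunk_len]))
    top_chunks = sorted(chunks, key=lambda c: rel(query, c), reverse=True)[:m]

    # Softmax-normalize the query-chunk scores into chunk weights.
    ws = [math.exp(rel(query, c)) for c in top_chunks]
    z = sum(ws)

    # Phase 3: final score interpolates the original query-document score
    # with the chunk-weighted chunk-document scores.
    def final(d):
        expansion = sum(w / z * rel(c, d) for w, c in zip(ws, top_chunks))
        return (1 - alpha) * rel(query, d) + alpha * expansion

    return sorted(ranked, key=final, reverse=True)
```

In this sketch the efficiency trade-off discussed in the abstract corresponds to how expensive `rel` is in phase two and three, which is where smaller BERT variants or late interaction methods would be substituted.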
