Abstract

Query expansion aims to mitigate the mismatch between the language used in a query and in a document. However, query expansion methods risk introducing non-relevant information when expanding the query. To bridge this gap, inspired by recent advances in applying contextualized models like BERT to the document retrieval task, this paper proposes a novel query expansion model that leverages the strength of the BERT model to select relevant document chunks for expansion. In evaluations on the standard TREC Robust04 and GOV2 test collections, the proposed BERT-QE model significantly outperforms BERT-Large models.

Highlights

  • In information retrieval, the language used in a query and in a document differs in terms of verbosity, formality, and even the format

  • In order to reduce this gap, different query expansion methods have been proposed and have enjoyed success in improving document rankings. Such methods commonly take a pseudo relevance feedback (PRF) approach in which the query is expanded using top-ranked documents and the expanded query is used to rank the search results (Rocchio, 1971; Lavrenko and Croft, 2001; Amati, 2003; Metzler and Croft, 2007). Due to their reliance on pseudo relevance information, such expansion methods suffer from any non-relevant information in the feedback documents, which can pollute the query after expansion

  • For the proposed BERT-QE, in phase two, kd = 10 top-ranked documents from the phase-one search results are used, from which kc = 10 chunks are selected for expansion, with a chunk length of m = 10 (see the sketch after this list)
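
Below is a minimal, runnable sketch of how such a chunk-based expansion pipeline could look, assuming the kd/kc/m values quoted above. The bert_score() helper is a toy word-overlap stand-in for a fine-tuned BERT relevance scorer, and the softmax chunk weighting and the interpolation weight alpha are illustrative assumptions rather than the paper's exact formulation.

import math


def bert_score(text_a: str, text_b: str) -> float:
    """Toy relevance score: word overlap. A real system would call BERT here."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    return len(a & b) / (len(a | b) or 1)


def split_into_chunks(doc: str, m: int):
    """Split a document into non-overlapping chunks of m words."""
    words = doc.split()
    return [" ".join(words[i:i + m]) for i in range(0, len(words), m)]


def bert_qe_rerank(query, ranked_docs, kd=10, kc=10, m=10, alpha=0.5):
    """Re-rank phase-one results (best first) using selected expansion chunks."""
    # Phase two: score all chunks from the top kd documents against the query
    # and keep the kc highest-scoring chunks.
    scored_chunks = [
        (chunk, bert_score(query, chunk))
        for doc in ranked_docs[:kd]
        for chunk in split_into_chunks(doc, m)
    ]
    top_chunks = sorted(scored_chunks, key=lambda x: x[1], reverse=True)[:kc]

    # Softmax-normalize the chunk scores into weights.
    total = sum(math.exp(s) for _, s in top_chunks) or 1.0
    weights = [(c, math.exp(s) / total) for c, s in top_chunks]

    # Phase three: combine the query-document score with the evidence from
    # the selected chunks and re-rank.
    rescored = []
    for doc in ranked_docs:
        chunk_evidence = sum(w * bert_score(chunk, doc) for chunk, w in weights)
        rescored.append((doc, alpha * bert_score(query, doc) + (1 - alpha) * chunk_evidence))
    return [doc for doc, _ in sorted(rescored, key=lambda x: x[1], reverse=True)]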


Summary

Introduction

The language used in a query and in a document differs in terms of verbosity, formality, and even the format (e.g., the use of keywords in a query versus the use of natural language in an article from Wikipedia). In order to reduce this gap, different query expansion methods have been proposed and have enjoyed success in improving document rankings. Such methods commonly take a pseudo relevance feedback (PRF) approach in which the query is expanded using top-ranked documents and the expanded query is used to rank the search results (Rocchio, 1971; Lavrenko and Croft, 2001; Amati, 2003; Metzler and Croft, 2007). In the context of neural approaches, the recent neural PRF architecture (Li et al., 2018) uses feedback documents directly for expansion. All these methods are under-equipped to accurately evaluate the relevance of information pieces used for expansion. This can be caused by the mixing of relevant and non-relevant information in the expansion, as with the tokens in RM3 (Lavrenko and Croft, 2001) and the documents in NPRF (Li et al., 2018), or by the fact that the models used for selecting and re-weighting the expansion information are not powerful enough, as they are essentially scalars based on counting.
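
To make the "scalars based on counting" point concrete, the following is a minimal sketch of count-based pseudo relevance feedback in the spirit of RM3 (Lavrenko and Croft, 2001): expansion terms are weighted by simple frequency statistics over the top-ranked feedback documents. The tokenization, the number of expansion terms, and the interpolation weight are illustrative assumptions, not the published formulation.

from collections import Counter


def prf_expand(query_terms, feedback_docs, num_expansion_terms=10, orig_weight=0.6):
    """Return an expanded, weighted query built from top-ranked feedback documents."""
    # Count how often each term appears across the feedback documents.
    counts = Counter()
    for doc in feedback_docs:
        counts.update(doc.lower().split())

    # Turn raw counts into relative frequencies for the most common terms.
    total = sum(counts.values()) or 1
    expansion = {t: c / total for t, c in counts.most_common(num_expansion_terms)}

    # Interpolate the original query terms with the expansion terms.
    expanded = {t: orig_weight / len(query_terms) for t in query_terms}
    for term, weight in expansion.items():
        expanded[term] = expanded.get(term, 0.0) + (1 - orig_weight) * weight
    return expanded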


