Abstract

Passage re-ranking is a machine learning task that estimates relevance scores between a given query and candidate passages. Passage re-ranking models have traditionally relied on keyword features derived from the lexical similarity between queries and passages. However, such approaches have a limitation: they struggle to capture semantic and contextual features beyond word-matching information. Recently, several studies based on pre-trained neural language models such as BERT have overcome the limitations of traditional keyword-based models and shown significant performance improvements. These ranking models capture the contextual features of queries and documents better than traditional keyword-based methods, but they require large amounts of training data. Such data is usually labeled manually at high cost, so using it efficiently is an important issue. This paper proposes a fine-tuning method for efficiently training a neural re-ranking model. The proposed model exploits data augmentation by learning the ranking and masked language modeling (MLM) tasks simultaneously during fine-tuning. For the MLM task, different parts of a passage are masked at each training epoch, so even when only one query-passage pair is available, the model is exposed to diverse variants of the passage through dynamic masking. In addition, the model is trained on a probability distribution of term importance: we compute term importance weights with two novel methods based on BM25 and pseudo-relevance feedback, sample terms according to these weights, and mask the sampled terms. By performing the MLM task, the ranking model learns representations that reflect the term-weight distribution, and the pseudo-relevance-feedback variant lets the model form representations guided by feedback from an initial retrieval stage. The proposed model is trained on the MS MARCO passage re-ranking data. Our model achieves the highest MRR@10 score on the leaderboard among non-ensemble methods, and it shows strong performance on three evaluation metrics: MRR@10, Mean Rank, and Hit@(5,10,20,50).
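
The importance-weighted dynamic masking described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the corpus statistics (CORPUS_SIZE, DOC_FREQ), the BM25 hyperparameters, and the whitespace tokenization are all assumptions for the sake of a runnable example, and the pseudo-relevance-feedback weighting from the paper is not reproduced here.

```python
import math
import random

# Hypothetical corpus statistics; in practice these come from the retrieval index.
CORPUS_SIZE = 8_841_823                      # assumed, e.g. MS MARCO passage count
DOC_FREQ = {"neural": 120_000, "ranking": 95_000, "the": 8_000_000}  # toy values

K1, B, AVG_LEN = 1.2, 0.75, 60.0             # standard BM25 hyperparameters (assumed)


def bm25_term_weight(term: str, passage_terms: list) -> float:
    """BM25-style importance weight of a single term within a passage."""
    tf = passage_terms.count(term)
    df = DOC_FREQ.get(term, 1)
    idf = math.log(1 + (CORPUS_SIZE - df + 0.5) / (df + 0.5))
    norm = tf * (K1 + 1) / (tf + K1 * (1 - B + B * len(passage_terms) / AVG_LEN))
    return idf * norm


def importance_masking(passage: str, mask_ratio: float = 0.15,
                       mask_token: str = "[MASK]") -> list:
    """Mask tokens sampled according to their BM25 importance weights.

    Re-running this at each epoch produces a different masked view of the
    same passage, which gives the data-augmentation effect described above.
    """
    tokens = passage.lower().split()
    weights = [bm25_term_weight(t, tokens) for t in tokens]

    n_mask = max(1, int(mask_ratio * len(tokens)))
    # Sampling with replacement may pick a position twice; the set keeps it simple.
    masked_positions = set(random.choices(range(len(tokens)), weights=weights, k=n_mask))
    return [mask_token if i in masked_positions else t for i, t in enumerate(tokens)]


if __name__ == "__main__":
    passage = "Neural ranking models score the relevance of a passage to a query"
    print(importance_masking(passage))
```

Under this sketch, high-weight (more discriminative) terms are masked more often, so the MLM objective pushes the encoder to reconstruct exactly the terms that matter most for ranking.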
