Mitigating the Impact of False Negative in Dense Retrieval with Contrastive Confidence Regularization

Shiqi Wang, Yeqin Zhang, Cam-Tu Nguyen

doi:10.1609/aaai.v38i17.29885

Abstract

In open-domain Question Answering (QA), dense text retrieval is crucial for finding relevant passages to generate answers. Typically, contrastive learning is used to train a retrieval model, which maps passages and queries to the same semantic space, making similar ones closer and dissimilar ones further apart. However, training such a system is challenging due to the false negative problem, where relevant passages may be missed during data annotation. Hard negative sampling, commonly used to improve contrastive learning, can introduce more noise in training. This is because hard negatives are those close to a given query, and thus more likely to be false negatives. To address this, we propose a novel contrastive confidence regularizer for Noise Contrastive Estimation (NCE) loss, a commonly used contrastive loss. Our analysis shows that the regularizer helps make the dense retrieval model more robust against false negatives with a theoretical guarantee. Additionally, we propose a model-agnostic method to filter out noisy negative passages in the dataset, improving any downstream dense retrieval models. Through experiments on three datasets, we demonstrate that our method achieves better retrieval performance in comparison to existing state-of-the-art dense retrieval systems.
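
To make the false negative issue concrete, the following is a minimal sketch of an NCE-style contrastive loss for a single query, assuming dot-product similarity and synthetic embeddings. The function name `nce_loss`, the embedding dimension, and the random data are illustrative assumptions, not taken from the paper, and the sketch does not implement the proposed contrastive confidence regularizer. It only shows why a negative that lies close to the query, as an unlabeled relevant passage would, raises the loss and pushes the model away from a passage it should retrieve.

```python
import numpy as np

def nce_loss(query, positive, negatives):
    """NCE-style loss for one query: negative log-softmax score of the positive."""
    # Stack the positive (row 0) on top of the negatives and score by dot product.
    candidates = np.vstack([positive[None, :], negatives])
    scores = candidates @ query
    # Numerically stable log-softmax over all candidates.
    scores = scores - scores.max()
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[0]

# Synthetic embeddings (assumption: 8-dimensional, Gaussian).
rng = np.random.default_rng(0)
query = rng.normal(size=8)
positive = query + 0.1 * rng.normal(size=8)
easy_negatives = rng.normal(size=(4, 8))

# A "hard" negative sits close to the query, so it may be a false negative.
hard_negative = query + 0.1 * rng.normal(size=8)
hard_negatives = np.vstack([easy_negatives, hard_negative[None, :]])

loss_easy = nce_loss(query, positive, easy_negatives)
loss_hard = nce_loss(query, positive, hard_negatives)
print(f"loss with easy negatives: {loss_easy:.4f}")
print(f"loss with one hard negative added: {loss_hard:.4f}")
```

If the hard negative is in fact relevant, the extra loss it contributes is noise in the training signal, which is the effect the abstract attributes to hard negative sampling.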

More From: Proceedings of the AAAI Conference on Artificial Intelligence

Similar Papers

An Ontology-Driven Question Answering System For Computer Network Module
M I M Nowshad ... U U Samantha Rajapaksha
02 Dec 2021

Linguistic and semantic passage retrieval strategies for question answering
Matthew W Bilotti
ACM SIGIR Forum | VOL. 44
03 Jan 2011

Enhancing Knowledge Acquisition Systems with User Generated and Crowdsourced Resources
01 Jan 2012

Question answering systems for health professionals at the point of care-a systematic review.
Gregory Kell ... Iain J Marshall
Journal of the American Medical Informatics Association : JAMIA | VOL. 31
16 Feb 2024
