Abstract

Dense passage retrieval has recently become a popular technique in information retrieval, especially in open-domain question answering, where the goal is to retrieve relevant passages from a massive corpus to answer a given question. Compared with other methods, dense retrievers increase retrieval speed with little loss of accuracy. However, the pretrained language models used in recent work often produce ineffective semantic embeddings, which reduces accuracy. In addition, we find that contrastive learning tends to disperse the representation space, and that Siamese models with independent parameters on each side degrade generalization. We therefore propose span prompt dense passage retrieval (SPDPR), which combines span-mask prompt tuning with parameter sharing for Chinese open-domain dense retrieval. The model produces more effective representation embeddings and counteracts the tendency of positive samples to drift apart. We evaluate SPDPR on DYKzh as well as two other Chinese datasets; SPDPR surpasses all SOTA baselines implemented on DYKzh and achieves competitive results on the other datasets.
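To make the setup concrete, the following is a minimal sketch (not the paper's actual architecture) of the two ingredients the abstract names: a Siamese dual encoder whose two towers share parameters, trained with an in-batch contrastive objective where each question's matching passage is the positive and the other passages in the batch serve as negatives. The linear encoder, weight shapes, and temperature value are illustrative assumptions; a real retriever would use a pretrained language model for `embed`.

```python
import numpy as np

def embed(x, W):
    # Shared-parameter encoder: the SAME weights W encode both
    # questions and passages (parameter sharing across towers).
    # A stand-in for a pretrained language model in this sketch.
    h = x @ W
    return h / np.linalg.norm(h, axis=1, keepdims=True)  # unit-normalize

def in_batch_contrastive_loss(q, p, temperature=0.05):
    # Similarity matrix over the batch: diagonal entries are
    # positive (question, passage) pairs; off-diagonal entries
    # act as in-batch negatives.
    sims = (q @ p.T) / temperature
    sims -= sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    # Cross-entropy: push each question toward its own passage.
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))  # hypothetical shared encoder weights
questions = rng.normal(size=(3, 8))
# Passage i is a noisy copy of question i, i.e. its positive sample.
passages = questions + 0.1 * rng.normal(size=(3, 8))

q, p = embed(questions, W), embed(passages, W)
loss = in_batch_contrastive_loss(q, p)
print(float(loss))
```

Sharing `W` between the two towers is the parameter-sharing choice the abstract argues for: with independent weights, the question and passage embedding spaces can drift apart, which the authors link to weaker generalization.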
