Abstract

Dense passage retrieval has recently become a popular technique in information retrieval, especially in open-domain question answering, where the goal is to retrieve relevant passages from a massive corpus to answer a given question. Compared with other methods, dense retrievers increase retrieval speed with little loss of accuracy. However, the pretrained language models used in recent work often produce ineffective semantic embeddings, which reduces accuracy. In addition, we find that contrastive learning tends to disperse the representation space, and that Siamese models with independent parameters on each side degrade generalization. We therefore propose span prompt dense passage retrieval (SPDPR), which combines span-mask prompt tuning with parameter sharing for Chinese open-domain dense retrieval. The model produces more effective representation embeddings and counteracts the tendency of positive samples to drift apart. We evaluate SPDPR on DYKzh as well as two other Chinese datasets; SPDPR surpasses all SOTA baselines implemented on DYKzh and achieves competitive results on the other datasets.
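To make the setup concrete, the following is a minimal sketch (not the paper's actual architecture) of the two ingredients the abstract names: a Siamese dual encoder whose two towers share parameters, trained with an in-batch contrastive objective where each question's matching passage is the positive and the other passages in the batch serve as negatives. The linear encoder, weight shapes, and temperature value are illustrative assumptions; a real retriever would use a pretrained language model for `embed`.

```python
import numpy as np

def embed(x, W):
    # Shared-parameter encoder: the SAME weights W encode both
    # questions and passages (parameter sharing across towers).
    # A stand-in for a pretrained language model in this sketch.
    h = x @ W
    return h / np.linalg.norm(h, axis=1, keepdims=True)  # unit-normalize

def in_batch_contrastive_loss(q, p, temperature=0.05):
    # Similarity matrix over the batch: diagonal entries are
    # positive (question, passage) pairs; off-diagonal entries
    # act as in-batch negatives.
    sims = (q @ p.T) / temperature
    sims -= sims.max(axis=1, keepdims=True)  # numerical stability
    log_probs = sims - np.log(np.exp(sims).sum(axis=1, keepdims=True))
    # Cross-entropy: push each question toward its own passage.
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))  # hypothetical shared encoder weights
questions = rng.normal(size=(3, 8))
# Passage i is a noisy copy of question i, i.e. its positive sample.
passages = questions + 0.1 * rng.normal(size=(3, 8))

q, p = embed(questions, W), embed(passages, W)
loss = in_batch_contrastive_loss(q, p)
print(float(loss))
```

Sharing `W` between the two towers is the parameter-sharing choice the abstract argues for: with independent weights, the question and passage embedding spaces can drift apart, which the authors link to weaker generalization.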
