Abstract

Community Question Answering (CQA) platforms let users with different backgrounds share information and knowledge. As CQA has grown in popularity, a large number of question-answer (Q-A) pairs have accumulated, many of them duplicates, and detecting duplicate questions has become an active research topic. However, most existing techniques rely only on the questions themselves and ignore the paired answers, which may also contain necessary information. In this paper, we propose a BERT-encoded Hierarchical Question-Answer Cross-Attention Network for Duplicate Question Detection (Bert-QAnet). Our model applies BERT to encode text and extract text features. We then use cross-attention to integrate word-level features from both the question and the answer, and inner attention to capture the interaction between question and answer. As a result, Bert-QAnet exploits the semantic information in paired answers at both the word level and the sentence level. We evaluate our model on two datasets: the Yahoo! Answers dataset and the Stack Overflow dataset. To meet the requirements of this study, both datasets are extended with paired answers. Experimental results demonstrate that our proposed model achieves state-of-the-art performance.
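
To illustrate the word-level cross-attention idea mentioned above, the following is a minimal sketch and not the Bert-QAnet implementation. It assumes scaled dot-product scoring (the abstract does not state the scoring function), uses random vectors as stand-ins for BERT token features, and the names `cross_attention`, `q_tokens`, and `a_tokens` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(q_tokens, a_tokens):
    """Bidirectional cross-attention between question and answer tokens.

    q_tokens: (Lq, d) token features of the question (assumed encoder output)
    a_tokens: (La, d) token features of the paired answer
    """
    d = q_tokens.shape[-1]
    # Assumption: scaled dot-product similarity between every token pair.
    scores = q_tokens @ a_tokens.T / np.sqrt(d)   # (Lq, La)
    q_to_a = softmax(scores, axis=1)              # each question token attends over answer tokens
    a_to_q = softmax(scores.T, axis=1)            # each answer token attends over question tokens
    q_ctx = q_to_a @ a_tokens                     # answer-aware question features, (Lq, d)
    a_ctx = a_to_q @ q_tokens                     # question-aware answer features, (La, d)
    return q_ctx, a_ctx, q_to_a, a_to_q


# Random stand-ins for encoder output: 5 question tokens, 7 answer tokens, dim 8.
rng = np.random.default_rng(0)
question = rng.normal(size=(5, 8))
answer = rng.normal(size=(7, 8))

q_ctx, a_ctx, q_to_a, a_to_q = cross_attention(question, answer)
print(q_ctx.shape, a_ctx.shape)  # (5, 8) (7, 8)
```

In a full model, the two context matrices would typically be pooled or passed to further layers before classification; how Bert-QAnet combines them is not specified in this abstract.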
