HCA: Hierarchical Compare Aggregate model for question retrieval in community question answering

Mohammad Sadegh Zahedi,Maseud Rahgozar,Reza Aghaeizadeh Zoroofi

doi:10.1016/j.ipm.2020.102318

Mohammad Sadegh Zahedi, Maseud Rahgozar + Show 1 more

https://doi.org/10.1016/j.ipm.2020.102318

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

We address the problem of finding similar historical questions that are semantically equivalent or relevant to an input query question in community question-answering (CQA) sites. One of the main challenges for this task is that questions are usually too long and often contain peripheral information in addition to the main goals of the question. To address this problem, we propose an end-to-end Hierarchical Compare Aggregate (HCA) model that can handle this problem without using any task-specific features. We first split questions into sentences and compare every sentence pair of the two questions using a proposed Word-Level-Compare-Aggregate model called WLCA-model and then the comparison results are aggregated with a proposed Sentence-Level-Compare-Aggregate model to make the final decision. To handle the insufficient training data problem, we propose a sequential transfer learning approach to pre-train the WLCA-model on a large paraphrase detection dataset. Our experiments on two editions of the Semeval benchmark datasets and the domain-specific AskUbuntu dataset show that our model outperforms the state-of-the-art models.

Full Text