Abstract

Millions of users rely on community question answering (CQA) systems to share valuable knowledge. An essential function of CQA systems is accurately matching answers to a given question. Recent research demonstrates the advantages of graph neural networks (GNNs) in modeling content semantics for CQA matching. However, existing GNN-based approaches do not adequately handle the multi-modal and redundant properties of CQA content. In this paper, we propose a multi-modal attentive graph pooling approach (MMAGP) that models the multi-modal content of questions and answers with GNNs in a unified framework and exploits the multi-modal and redundant properties of CQA systems. Our model converts each question/answer into a multi-modal content graph, which preserves the relational information within the multi-modal content. Specifically, to exploit visual information, we propose an unsupervised meta-path link prediction approach that extracts labels from visual content and incorporates them into the multi-modal graph. An attentive graph pooling network is proposed to adaptively select vertices in the multi-modal content graph that are significant for matching, and to generate a pooled graph by aggregating context information for the selected vertices. An interaction pooling network is designed to infer the final matching score from the interactions between the pooled graphs of the input question and answer. Experimental results on two real-world datasets demonstrate the superior performance of MMAGP compared with other state-of-the-art CQA matching models.
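The sketch below illustrates the attentive graph pooling idea summarized above: score vertices with an attention vector, keep the top-k, and aggregate neighborhood context into the pooled graph. It is a minimal NumPy illustration, not the authors' exact formulation; the scoring function, gating, and all names (attentive_graph_pool, w, k) are illustrative assumptions.

```python
# Minimal sketch of attentive graph pooling as described in the abstract.
# All shapes, parameter names, and the top-k scoring scheme are assumptions.
import numpy as np

def attentive_graph_pool(X, A, w, k):
    """Select the k most salient vertices and aggregate neighbor context.

    X : (n, d) vertex feature matrix of one multi-modal content graph
    A : (n, n) adjacency matrix (1 where vertices are linked)
    w : (d,)   attention vector (assumed; learned end-to-end in the paper)
    k : number of vertices kept in the pooled graph
    """
    scores = np.tanh(X @ w)                        # attention score per vertex (squashing is an assumed choice)
    idx = np.argsort(-scores)[:k]                  # adaptively keep the k highest-scoring vertices
    context = A @ X                                # sum of each vertex's neighbor features
    pooled_X = (X[idx] + context[idx]) * scores[idx, None]  # gate aggregated features by attention
    pooled_A = A[np.ix_(idx, idx)]                 # adjacency of the induced pooled subgraph
    return pooled_X, pooled_A

# Toy usage: a 5-vertex graph with 4-dimensional vertex features.
X = np.random.rand(5, 4)
A = (np.random.rand(5, 5) > 0.5).astype(float)
pooled_X, pooled_A = attentive_graph_pool(X, A, w=np.random.rand(4), k=3)
print(pooled_X.shape, pooled_A.shape)              # (3, 4) (3, 3)
```

In the full model, the pooled graphs of a question and an answer would then be compared by the interaction pooling network to produce the matching score.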
