Tree-KGQA: An Unsupervised Approach for Question Answering Over Knowledge Graphs

Md Rashad Al Hasan Rony, Debanjan Chaudhuri, Ricardo Usbeck, Jens Lehmann

doi:10.1109/access.2022.3173355

Open Access

Journal: IEEE Access
Publication Date: Jan 1, 2022
Citations: 5
License type: CC BY 4.0

Affiliation: University of Bonn, Universität Hamburg

Abstract

Most Knowledge Graph-based Question Answering (KGQA) systems rely on training data to reach their optimal performance. However, acquiring training data for supervised systems is both time-consuming and resource-intensive. To address this, we propose Tree-KGQA, an unsupervised KGQA system that leverages pre-trained language models and tree-based algorithms. Entity and relation linking are essential components of any KGQA system. For entity linking, we employ several pre-trained language models to recognize the entities mentioned in the question and to obtain contextual representations for indexing. For relation linking, we incorporate a pre-trained language model previously trained on a language inference task. Finally, we introduce a novel algorithm for extracting the answer entities from a KG, in which we construct a forest of interpretations and introduce tree-walking and tree disambiguation techniques. Our algorithm uses the linked relation and predicts the tree branches that eventually lead to the potential answer entities. The proposed method achieves 4.5% and 7.1% gains in F1 score in entity linking on the LC-QuAD 2.0 and LC-QuAD 2.0 (KBpearl) datasets, respectively, and a 5.4% increase in relation linking on LC-QuAD 2.0 (KBpearl). Comprehensive evaluations demonstrate that our unsupervised KGQA approach outperforms supervised state-of-the-art methods on the WebQSP-WD test set (a 1.4% increase in F1 score) without training on the target dataset.
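
To make the "forest of interpretations" idea more concrete, the following is a minimal toy sketch and not the authors' algorithm. It assumes each candidate entity becomes the root of a one-hop tree whose branches are grouped by relation, and that relation linking has already produced a score per relation. The triples, the scores, and the function names build_forest and walk_forest are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, relation, object); not taken from any real KG.
TRIPLES = [
    ("Paris_city", "capital_of", "France"),
    ("Paris_city", "population", "2.1M"),
    ("Paris_person", "sibling", "Nicky"),
]


def build_forest(candidate_entities, triples):
    """Build one tree per candidate root; each branch groups objects by relation."""
    forest = {root: defaultdict(list) for root in candidate_entities}
    for subj, rel, obj in triples:
        if subj in forest:
            forest[subj][rel].append(obj)
    return forest


def walk_forest(forest, relation_scores):
    """Select the (tree, branch) pair with the highest relation score.

    Choosing the tree is a stand-in for disambiguation; following the chosen
    branch is a stand-in for tree walking. Returns (root, relation, answers).
    """
    best_score, best_root, best_rel = float("-inf"), None, None
    for root, branches in forest.items():
        for rel in branches:
            score = relation_scores.get(rel, 0.0)  # unseen relations score 0.0
            if score > best_score:
                best_score, best_root, best_rel = score, root, rel
    if best_root is None:
        return None, None, []
    return best_root, best_rel, forest[best_root][best_rel]


# Assumed relation-linking scores for a question like "Paris is the capital of what?"
scores = {"capital_of": 0.9, "population": 0.2, "sibling": 0.1}
forest = build_forest(["Paris_city", "Paris_person"], TRIPLES)
root, rel, answers = walk_forest(forest, scores)
```

In this sketch, picking the best-scoring branch also resolves which candidate entity the question refers to. A real system would need multi-hop trees and learned or model-derived scores, which the abstract does not specify.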

Full Text