What is in the KGQA Benchmark Datasets? Survey on Challenges in Datasets for Question Answering on Knowledge Graphs

Nadine Steinmetz, Kai-Uwe Sattler

doi:10.1007/s13740-021-00128-9

Abstract

Question Answering based on Knowledge Graphs (KGQA) still faces difficult challenges when transforming natural language (NL) to SPARQL queries. Simple questions that refer to only one triple are answerable by most QA systems, but more complex questions that require queries containing subqueries or several functions remain a tough challenge in this field of research. Evaluation results of QA systems may therefore also depend on the benchmark dataset on which the system has been tested. To give an overview and reveal specific characteristics, we examined currently available KGQA datasets regarding several challenging aspects. This paper presents a detailed look into the datasets and compares them in terms of the challenges a KGQA system faces.
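To make the distinction between simple and complex questions concrete, the following is a minimal sketch and not the authors' analysis method. The two SPARQL queries, the keyword list, and the function names `complexity_features` and `is_simple` are all assumptions introduced for illustration. The check is a rough keyword heuristic rather than a real SPARQL parser.

```python
import re

# Hypothetical query for a simple question ("What is the capital of Germany?"):
# a single triple pattern is enough to answer it.
SIMPLE_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?capital WHERE { dbr:Germany dbo:capital ?capital . }
"""

# Hypothetical query for a complex question ("Which country has the most cities?"):
# it needs an aggregate function, grouping, ordering, and a limit.
COMPLEX_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?country (COUNT(?city) AS ?n) WHERE {
  ?city a dbo:City ;
        dbo:country ?country .
}
GROUP BY ?country
ORDER BY DESC(?n)
LIMIT 1
"""

# Assumed list of SPARQL keywords treated as signs of a complex query.
COMPLEX_KEYWORDS = ("COUNT", "FILTER", "GROUP BY", "ORDER BY",
                    "UNION", "OPTIONAL", "LIMIT")


def complexity_features(query: str) -> list[str]:
    """Return the complexity-related keywords found in a SPARQL query string."""
    upper = query.upper()
    features = []
    for keyword in COMPLEX_KEYWORDS:
        # Allow arbitrary whitespace inside two-word keywords such as GROUP BY.
        pattern = r"\b" + keyword.replace(" ", r"\s+") + r"\b"
        if re.search(pattern, upper):
            features.append(keyword)
    # More than one SELECT suggests a nested subquery.
    if len(re.findall(r"\bSELECT\b", upper)) > 1:
        features.append("SUBQUERY")
    return features


def is_simple(query: str) -> bool:
    """Treat a query as simple when no complexity feature is detected."""
    return not complexity_features(query)
```

Under this heuristic, `is_simple(SIMPLE_QUERY)` holds while `COMPLEX_QUERY` is flagged for its aggregate, grouping, ordering, and limit. A keyword scan can misfire on keywords that appear inside string literals, so a proper parser would be preferable for any real dataset analysis.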

Highlights

Question answering (QA) aims at answering questions formulated in natural language on data sources and combines methods from natural language processing (NLP), linguistics, database processing, and information retrieval. Though early research activities were already conducted in the sixties, QA has received great attention again over the last few years.
Approaches based on semantic knowledge bases such as RDF knowledge graphs, which we refer to as Question Answering on Knowledge Graphs (KGQA) in the following, are very promising because they can rely on large knowledge datasets such as DBpedia and simplify tasks such as mapping and disambiguation.
With questions alone, the context is meager, and disambiguation is apparently unsuccessful in many cases. This experiment shows that the disambiguation process should not be considered before creating the SPARQL queries during the QA pipeline.

Summary

Introduction

Question answering (QA) aims at answering questions formulated in natural language on data sources and combines methods from natural language processing (NLP), linguistics, database processing, and information retrieval. Applications that transform natural language questions to formal queries on structured data can be summarized as the class of Natural Language Interfaces to Databases (NLIDB). Further datasets have been created and published to evaluate KGQA systems that transform NL to DBpedia-based SPARQL queries. In this work, we present a comparative survey of available datasets for KGQA. The intention of this survey is two-fold; the first aim is to provide QA researchers with an overview of existing datasets, their structure, and their characteristics. These datasets are all based on DBpedia of 2016. We analyzed and compared these datasets in view of the following challenges to KGQA systems.

Related Work

LC-QuAD

SimpleDBpediaQA

Overview

Topic Definition

Analysis Description

Result Discussion

Lexical Gap

Complex Queries

Ontology Types

Answer Types

Findings

Discussion & Summary


Journal: Journal on Data Semantics	Publication Date: Jun 1, 2021
Citations: 6	License type: open-access
