Abstract

Providing intuitive access to Knowledge Graphs (KGs) has been a prominent area of research in recent years. In particular, several Question Answering (QA) approaches have been developed to support natural-language queries over KGs. QA systems allow non-technical users to access information in KGs, dispensing with the need to learn the graph schema and query languages. Despite the significant evolution of QA methods over the past years, challenges remain because of the differences between unstructured natural language and the structured data stored in KGs, such as semantic variability. To the best of our knowledge, existing QA datasets lack semantic variability in their question-query pairs. In addition, many approaches used in QA systems require large datasets for a preprocessing or training step, but only a few such datasets are publicly available. This paper presents ExQuestions, a dataset for question answering over knowledge graphs (KGQA) with multiple paraphrased questions that use common-sense knowledge. ExQuestions (https://zenodo.org/record/4947611#.YMeo6KhKjIU) contains 128,000 question-answer pairs, each with the question in natural language, paraphrased natural-language questions, the natural-language question with its type, the SPARQL query, and the template for the question. We also illustrate the advantage of having multiple paraphrased questions. The ExQuestions dataset is publicly available at a persistent URI for broader use and adaptation in the research community.
