Abstract
Providing intuitive access to Knowledge Graphs (KGs) has been a prominent area of research in recent years. In particular, several Question Answering (QA) approaches have been developed to support queries over KGs using natural language. QA systems allow non-technical users to access information in KGs, removing the need to learn the graph schema and query languages. Despite the significant evolution of QA methods over the past years, challenges such as semantic variability remain because of the differences between unstructured natural language and the structured data stored in KGs. Specifically, to the best of our knowledge, existing QA datasets lack semantic variability in their question-query pairs. In addition, many approaches used in QA systems require large datasets for a preprocessing or training step, but only a few such datasets are publicly available. This paper presents ExQuestions, a dataset for question answering over knowledge graphs (KGQA) with multiple paraphrased questions using common-sense knowledge. ExQuestions (https://zenodo.org/record/4947611#.YMeo6KhKjIU) contains 128,000 question-answer pairs with natural-language questions, paraphrased natural-language questions, natural-language questions with type, SPARQL queries, and a template for each question. We complement the dataset to illustrate the advantage of having multiple paraphrased questions. The ExQuestions dataset is publicly available at a persistent URI for broader usage and adaptation in the research community.