SGPT: A Generative Approach for SPARQL Query Generation From Natural Language Questions

Md Rashad Al Hasan Rony, Uttam Kumar, Roman Teucher, Jens Lehmann, Liubov Kovriguina

doi:10.1109/access.2022.3188714

Open Access

https://doi.org/10.1109/access.2022.3188714


Abstract

SPARQL query generation from natural language questions is complex because it requires an understanding of both the question and the underlying knowledge graph (KG) patterns. Most SPARQL query generation approaches are template-based, tailored to a specific knowledge graph, and require multi-step pipelines that include entity and relation linking. Template-based approaches are also difficult to adapt to new KGs and require manual effort from domain experts to construct query templates. To overcome this hurdle, we propose a new approach, dubbed SGPT, that combines the benefits of end-to-end and modular systems and leverages recent advances in large-scale language models. Specifically, we devise a novel embedding technique that encodes linguistic features from the question, which enables the system to learn complex question patterns. In addition, we propose training techniques that allow the system to implicitly incorporate graph-specific information (i.e., entities and relations) into the language model's parameters and generate SPARQL queries accurately. Finally, we introduce a strategy to adapt standard automatic metrics for evaluating SPARQL query generation. A comprehensive evaluation demonstrates the effectiveness of SGPT over state-of-the-art methods across several benchmark datasets.

Full Text