Knowledge Graph Generation From Text Using Neural Machine Translation Techniques

Athang Gupte, Sheetal Sonawane, Saumitra Sapre

doi:10.1109/iccict50803.2021.9510164

Abstract

As the applications of data science become pervasive in daily life, there is a growing need to represent data in machine-understandable forms such as knowledge graphs. Over the years, there have been numerous developments in extracting entities and their relations to augment knowledge graphs, but many of them rely on external tools such as dependency parsers and part-of-speech taggers. These approaches accomplish the task, but they make the resulting implementations less flexible. Recent work in this domain has used Neural Machine Translation techniques to convert natural language to SPARQL queries, with a focus on information retrieval from pre-established knowledge graphs. We examine in detail the variety of approaches followed for SPARQL machine translation, with a focus on inserting extracted knowledge into the graphs. As part of our research, we curated a dataset, Scientists-100, extracted from DBpedia, for the task of translating natural language to SPARQL insertion statements. We also propose two models for this task: an Attention RNN and a Transformer. These models achieve accuracies of 99.27% and 98.61%, respectively, on the dataset. In addition, we present a metric for examining the syntactic accuracy of the generated SPARQL statements. Our models exhibit 99.25% and 98.71% syntactic accuracy under this metric.
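To make the translation task concrete, the following is a minimal illustrative sketch, not the authors' implementation. The sentence and statement pair is hypothetical and not taken from Scientists-100. The abstract does not define the syntactic accuracy metric, so the well-formedness check below is an assumed stand-in: it accepts only a simplified `INSERT DATA { ... }` form with prefixed names, IRIs, or quoted literals, and reports the fraction of statements that pass.

```python
import re

# Hypothetical sentence/statement pair for illustration only;
# it is not taken from the Scientists-100 dataset.
EXAMPLE_SENTENCE = "Marie Curie was born in Warsaw."
EXAMPLE_STATEMENT = "INSERT DATA { dbr:Marie_Curie dbo:birthPlace dbr:Warsaw . }"

# Simplified term grammar (an assumption, far narrower than real SPARQL):
# an <IRI>, a prefixed name such as dbr:Warsaw, or a double-quoted literal.
TERM = r'(?:<[^<>\s]+>|[A-Za-z][\w-]*:[\w.-]*\w|"[^"]*")'
# A triple is exactly three terms followed by a full stop.
TRIPLE = rf'{TERM}\s+{TERM}\s+{TERM}\s*\.'
# A statement is INSERT DATA followed by one or more triples in braces.
STATEMENT = re.compile(rf'^\s*INSERT\s+DATA\s*\{{\s*(?:{TRIPLE}\s*)+\}}\s*$')


def is_well_formed(statement: str) -> bool:
    """Return True if the statement matches the simplified INSERT DATA form."""
    return STATEMENT.match(statement) is not None


def syntactic_accuracy(statements: list[str]) -> float:
    """Fraction of statements that pass the simplified well-formedness check."""
    if not statements:
        return 0.0
    return sum(is_well_formed(s) for s in statements) / len(statements)


if __name__ == "__main__":
    generated = [
        EXAMPLE_STATEMENT,
        "INSERT DATA { dbr:Marie_Curie dbo:birthPlace . }",  # missing object
    ]
    print(syntactic_accuracy(generated))
```

A real evaluation would more plausibly validate statements with a full SPARQL parser, since a regular expression cannot cover the complete SPARQL Update grammar.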

Full Text