Knowledge graph question answering (KGQA) systems play an important role in retrieving data from a knowledge graph (KG). With such a system, regular users can access data in a KG without constructing a formal SPARQL query. A KGQA system receives a natural language question (NLQ) and translates it into a SPARQL query through three main tasks: entity and relation detection, entity and relation linking, and query construction. However, the translation is not trivial because of the lexical gaps and entity ambiguity that may arise during entity or relation linking. To address the lexical gap challenge, this research proposed an approach based on multiclass classification, in which an NLQ whose entity occurrences have been detected is classified into categories based on KG relations. To address the entity ambiguity challenge, this research proposed a three-stage searching procedure that determines the appropriate KG entities for the NLQ entities, given the correspondence between the NLQ and a particular KG relation. The three stages were text-based searching, vector-based searching, and entity and relation pairing. The proposed approach was evaluated on the SimpleQuestions and LC-QuAD 2.0 datasets, where it outperformed the state-of-the-art baseline. On the relation linking task, it reached 89.87% and 74.83% recall on SimpleQuestions and LC-QuAD 2.0, respectively. On the entity linking task, it achieved 91.74% and 61.96% recall on SimpleQuestions and LC-QuAD 2.0, respectively.
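
To make the three-stage searching idea more concrete, the following is a minimal sketch and not the implementation evaluated in this research. It assumes the relation has already been predicted by the classifier, uses a toy in-memory KG with invented identifiers (`E1`, `author`, and so on), and substitutes character-trigram cosine similarity for whatever vector representation the actual system uses. The function names and the `ex:` prefix are hypothetical.

```python
from collections import Counter
from math import sqrt

# Toy knowledge graph; every identifier and label here is invented for illustration.
LABELS = {
    "E1": "Dune",              # a novel
    "E2": "Dune",              # a film with the same label
    "E3": "Frank Herbert",
    "E4": "Denis Villeneuve",
}
TRIPLES = {
    ("E1", "author", "E3"),
    ("E2", "director", "E4"),
}


def trigrams(text):
    """Bag of character trigrams, a crude stand-in for a learned embedding."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def text_search(mention):
    """Stage 1: exact, case-insensitive label match."""
    return [e for e, label in LABELS.items() if label.lower() == mention.lower()]


def vector_search(mention, threshold=0.5):
    """Stage 2: similarity search, used here only when stage 1 finds nothing."""
    query = trigrams(mention)
    scored = [(cosine(query, trigrams(label)), e) for e, label in LABELS.items()]
    return [e for score, e in sorted(scored, reverse=True) if score >= threshold]


def pair_with_relation(candidates, relation):
    """Stage 3: keep candidates that occur as the subject of the predicted relation."""
    subjects = {s for s, r, _ in TRIPLES if r == relation}
    return [e for e in candidates if e in subjects]


def link_entity(mention, relation):
    """Run the three stages in order and return the surviving candidates."""
    candidates = text_search(mention) or vector_search(mention)
    return pair_with_relation(candidates, relation)


def build_query(entity, relation):
    """Assemble a single-triple SPARQL query; the ex: prefix is a placeholder."""
    return f"SELECT ?answer WHERE {{ ex:{entity} ex:{relation} ?answer . }}"


if __name__ == "__main__":
    # "Who wrote Dune?" with the relation assumed to be classified as "author".
    for entity in link_entity("Dune", "author"):
        print(build_query(entity, "author"))
```

In this sketch the two entities labelled "Dune" are ambiguous after the text-based stage, and the pairing stage resolves the ambiguity by keeping only the candidate that participates in the predicted relation. A misspelled mention such as "Dunne" falls through to the similarity stage before pairing.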