Efficient keyword search on graph data for finding diverse and relevant answers

Chang-Sup Park

doi:10.1108/ijwis-09-2022-0157

Abstract

PurposeThis paper studies a keyword search over graph-structured data used in various fields such as semantic web, linked open data and social networks. This study aims to propose an efficient keyword search algorithm on graph data to find top-k answers that are most relevant to the query and have diverse content nodes for the input keywords.Design/methodology/approachBased on an aggregative measure of diversity of an answer set, this study proposes an approach to searching the top-k diverse answers to a query on graph data, which finds a set of most relevant answer trees whose average dissimilarity should be no lower than a given threshold. This study defines a diversity constraint that must be satisfied for a subset of answer trees to be included in the solution. Then, an enumeration algorithm and a heuristic search algorithm are proposed to find an optimal solution efficiently based on the diversity constraint and an A* heuristic. This study also provides strategies for improving the performance of the heuristic search method.FindingsThe results of experiments using a real data set demonstrate that the proposed search algorithm can find top-k diverse and relevant answers to a query on large-scale graph data efficiently and outperforms the previous methods.Originality/valueThis study proposes a new keyword search method for graph data that finds an optimal solution with diverse and relevant answers to the query. It can provide users with query results that satisfy their various information needs on large graph data.

Full Text