Abstract

Graph keyword search has drawn many research interests, since graph models can generally represent both structured and unstructured databases and keyword searches can extract valuable information for users without the knowledge of the underlying schema and query language. In practice, data graphs can be extremely large, e.g., a Web-scale graph containing billions of vertices. The state-of-the-art approaches employ centralized algorithms to process graph keyword searches, and thus they are infeasible for such large graphs, due to the limited computational power and storage space of a centralized server. To address this problem, we investigate keyword search for Web-scale graphs deployed in a distributed environment. We first give a naive search algorithm to answer the query efficiently. However, the naive search algorithm uses a flooding search strategy that incurs large time and network overhead. To remedy this shortcoming, we then propose a signature-based search algorithm. Specifically, we design a vertex signature that encodes the shortest-path distance from a vertex to any given keyword in the graph. As a result, we can find query answers by exploring fewer paths, so that the time and communication costs are low. Moreover, we reorganize the graph data in the cluster after its initial random partitioning so that the signature-based techniques are more effective. Finally, our experimental results demonstrate the feasibility of our proposed approach in performing keyword searches over Web-scale graph data.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call