Distance labeling approaches are widely adopted to speed up shortest-distance queries. With the explosive growth of data graphs, a single machine can hardly meet the requirements for both computational power and memory capacity, which creates an urgent need for efficient distributed methods. Because the graph is partitioned across machines, deploying existing centralized distance labeling methods in a distributed environment inevitably requires frequent message exchange between machines, which incurs heavy communication costs and limits scalability. To alleviate this problem, we propose DH-Index, a distributed hop-based index that is built on a newly proposed boundary graph structure and restricts the index-based hop number of each connected vertex pair to within 4 hops. In addition, we propose a hierarchical algorithm to accelerate index construction and reduce the communication cost. Furthermore, we propose a bidirectional searching strategy to answer queries efficiently based on DH-Index. Comprehensive experiments on eight real-world graphs demonstrate that DH-Index achieves speedups of up to 65.5× in indexing time and 3 orders of magnitude in query performance over existing methods, and that it also outperforms them in memory usage, communication cost, and scalability.
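For readers unfamiliar with distance labeling, the following is a minimal sketch of the classical centralized 2-hop approach (pruned landmark labeling on an unweighted, undirected graph). It is not DH-Index and does not reflect the boundary graph structure or the distributed construction described above; the names `build_labels`, `query`, `ADJ`, and `LABELS` are hypothetical and chosen only for illustration.

```python
from collections import deque
import math


def query(labels, u, v):
    """Return the distance between u and v as the minimum over common hubs,
    or math.inf if the two label sets share no hub (u and v disconnected)."""
    lu, lv = labels[u], labels[v]
    if len(lu) > len(lv):  # iterate over the smaller label set
        lu, lv = lv, lu
    return min((d + lv[h] for h, d in lu.items() if h in lv), default=math.inf)


def build_labels(adj, order):
    """Build 2-hop labels with one pruned BFS per hub, in the given hub order.

    adj:   dict mapping each vertex to a list of neighbors (undirected).
    order: list of all vertices, most important hub first (an assumption
           of this sketch: importance is simply vertex degree).
    """
    labels = {v: {} for v in adj}
    for hub in order:
        dist = {hub: 0}
        queue = deque([hub])
        while queue:
            v = queue.popleft()
            # Prune: earlier hubs already certify a distance <= dist[v],
            # so neither v nor anything reached only through v needs this hub.
            if query(labels, hub, v) <= dist[v]:
                continue
            labels[v][hub] = dist[v]
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
    return labels


# Toy graph: a path 0-1-2-3-4 plus an isolated vertex 5.
ADJ = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}
ORDER = sorted(ADJ, key=lambda v: len(ADJ[v]), reverse=True)
LABELS = build_labels(ADJ, ORDER)
```

With these labels, a query touches only the two label sets involved, for example `query(LABELS, 0, 4)`, rather than traversing the graph. In a distributed setting the label sets of the two endpoints may reside on different machines, which is the source of the communication cost the abstract refers to.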