Abstract

Fast reachability detection is one of the key problems in graph applications. Most existing work focuses on building an index and answering reachability queries from that index. For these approaches, index construction time and index size can become a concern on large graphs. More recently, query-preserving graph compression has been proposed, and searching for reachability over the compressed graph has been shown to significantly improve query performance and reduce the index size. In this paper, we introduce a multilevel compression scheme for DAGs, which builds on existing compression schemes but can further reduce the graph size for many real-world graphs. We propose an algorithm to answer reachability queries using the compressed graph. Extensive experiments with four existing state-of-the-art reachability algorithms and 12 real-world datasets demonstrate that our approach outperforms the existing methods. Experiments with synthetic datasets confirm the scalability of this approach. We also provide a discussion on possible compression for k-reachability.

Highlights

  • The reachability query, which asks whether there exists a path from one vertex to another in a directed graph, finds numerous applications in graph and network analysis

  • The graph that results from applying transitive reduction and equivalence reduction to the original graph G can be much smaller while retaining all reachability information, and it was experimentally verified that, for many real-world graphs, searching for reachability over this reduced graph can be much faster than searching over G using state-of-the-art algorithms

  • We show how to use the decomposition tree to answer reachability queries over the original graph efficiently

Summary

Introduction

The reachability query, which asks whether there exists a path from one vertex to another in a directed graph, finds numerous applications in graph and network analysis. Such queries can be answered by graph traversal using either breadth-first or depth-first search in O(|E| + |V|) time without preprocessing (where V and E are the vertex set and edge set, respectively), or in constant time if we pre-compute and store the transitive closure of each vertex, which takes O(|V||E|) time and O(|V|^2) space. We organize the modules into a hierarchical structure called the modular decomposition tree, and propose an efficient algorithm that uses the tree to answer reachability queries.
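The two baseline strategies mentioned in the introduction, a per-query traversal and a precomputed transitive closure, can be illustrated with a minimal Python sketch. This is not the paper's method: the adjacency-dictionary representation, the function names `reachable_bfs` and `transitive_closure`, and the small example graph are assumptions made for illustration, and the sketch adopts the convention that every vertex reaches itself.

```python
from collections import deque


def reachable_bfs(adj, source, target):
    """Answer a single query by breadth-first search.

    No preprocessing; each query costs O(|V| + |E|) in the worst case.
    Convention assumed here: a vertex always reaches itself.
    """
    if source == target:
        return True
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v == target:
                return True
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False


def transitive_closure(adj):
    """Precompute, for every vertex, the set of vertices it can reach.

    One traversal per vertex gives O(|V||E|) time; storing all the
    reachable sets needs O(|V|^2) space in the worst case. A query is
    then a single set-membership test.
    """
    vertices = set(adj)
    for targets in adj.values():
        vertices.update(targets)
    closure = {}
    for s in vertices:
        seen = {s}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        closure[s] = seen
    return closure


# Hypothetical five-vertex DAG used only for illustration.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["d"]}
closure = transitive_closure(graph)
```

With the closure in hand, a query such as `"d" in closure["a"]` is a constant-time lookup, whereas `reachable_bfs(graph, "a", "d")` re-traverses the graph on every call. The quadratic storage of the closure is the cost that index-based and compression-based approaches aim to reduce.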

Index‐Based Approach
Compression‐Based Approach
Preliminaries
Redundant Edges
Equivalence Class
Modular Decomposition
Overview of Our Approach
Multilevel Compression and Modular Decomposition Tree
Answering Reachability Queries Using Modular Decomposition Tree
Building Modular Decomposition Tree
Complexity
Finding Reachability Using the Decomposition Tree
Size of the Decomposition Tree
Experiments
Implementation and Running Environment
Datasets and Queries
Datasets
Compression Ratio
Index Construction Time
Index Size
Query Performance
Experiments on Synthetic Datasets
Query Time
Discussion on k‐Hop Reachability
Findings
Conclusion
