Abstract
Finding or monitoring subgraph instances that are isomorphic to a given pattern graph in a data graph is a fundamental query operation in many graph analytic applications, such as network motif mining and fraud detection. Existing distributed methods are communication-inefficient: they must shuffle partial matching results during the distributed multiway join, and these partial results may be much larger than the data graph itself. To overcome this drawback, we develop the Batch-BENU framework for distributed subgraph enumeration on static data graphs. Batch-BENU executes a group of local search tasks in parallel. Each task enumerates subgraphs around a vertex in the data graph, guided by a backtracking-based execution plan. To handle large-scale data graphs that may exceed the memory capacity of a single machine, Batch-BENU stores the data graph in a distributed database. Each task queries adjacency sets of the data graph on demand, so the framework shuffles the data graph instead of partial matching results. To support incremental subgraph enumeration on dynamic data graphs, we propose the Streaming-BENU framework. Streaming-BENU turns the problem of enumerating incremental matching results into the problem of enumerating all matching results of incremental pattern graphs at each time step. We implement Batch-BENU and Streaming-BENU with a local database cache and a load-balancing optimization to improve their efficiency. Extensive experiments show that Batch-BENU and Streaming-BENU scale to big graphs and complex pattern graphs. They outperform the state-of-the-art distributed methods by up to one and two orders of magnitude, respectively.
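The idea of a local search task that backtracks around one start vertex and fetches adjacency sets on demand can be illustrated with a minimal sketch. This is not the authors' implementation: the in-memory dictionary `DATA_GRAPH` stands in for the distributed database, `functools.lru_cache` stands in for the local database cache, and the names `get_adjacency`, `local_search_task`, `TRIANGLE_EDGES`, and `MATCHING_ORDER` are hypothetical. The sketch assumes an undirected data graph and a matching order in which every pattern vertex after the first is adjacent to an earlier one, and it omits symmetry breaking, so each subgraph instance is reported once per automorphism of the pattern.

```python
from functools import lru_cache

# Toy undirected data graph. In a real deployment the adjacency sets would
# live in a distributed database; a dict is used here as a stand-in (assumption).
DATA_GRAPH = {
    1: {2, 3, 4},
    2: {1, 3},
    3: {1, 2, 4},
    4: {1, 3},
}

# Hypothetical pattern: a triangle on pattern vertices a, b, c.
TRIANGLE_EDGES = {("a", "b"), ("b", "c"), ("a", "c")}
MATCHING_ORDER = ["a", "b", "c"]


@lru_cache(maxsize=1024)
def get_adjacency(v):
    """Fetch one adjacency set on demand; lru_cache mimics a local cache."""
    return frozenset(DATA_GRAPH[v])


def local_search_task(start, pattern_edges, order):
    """Enumerate all matches that map the first pattern vertex to `start`."""
    matches = []

    def extend(mapping):
        if len(mapping) == len(order):
            matches.append(dict(mapping))
            return
        u = order[len(mapping)]
        # Pattern neighbours of u that already have an image in the data graph.
        matched_nbrs = [
            w for w in mapping
            if (u, w) in pattern_edges or (w, u) in pattern_edges
        ]
        if not matched_nbrs:
            raise ValueError("matching order must keep the pattern connected")
        # Candidates for u: intersection of the adjacency sets of those images.
        candidates = None
        for w in matched_nbrs:
            adj = get_adjacency(mapping[w])
            candidates = adj if candidates is None else candidates & adj
        for v in sorted(candidates):
            if v not in mapping.values():  # keep the mapping injective
                mapping[u] = v
                extend(mapping)
                del mapping[u]  # backtrack

    extend({order[0]: start})
    return matches


if __name__ == "__main__":
    # One independent task per data vertex; these could run in parallel.
    for s in sorted(DATA_GRAPH):
        for m in local_search_task(s, TRIANGLE_EDGES, MATCHING_ORDER):
            print(s, m)
```

Each task only exchanges adjacency sets with the store, never partial matches, which is the communication pattern the abstract describes. The tasks share no state apart from the read-only store and cache, so they can be distributed across workers independently.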
More From: IEEE Transactions on Parallel and Distributed Systems