Abstract

Graph partitioning has long been seen as a viable approach to addressing Graph DBMS scalability. A partitioning, however, may introduce extra query processing latency unless it is sensitive to a specific query workload and optimised to minimise inter-partition traversals for that workload. It should also be possible to incrementally adjust the partitioning in reaction to changes in the graph topology, the query workload, or both. Because of their complexity, current partitioning algorithms fall short of one or both of these requirements, as they are designed for offline use and as one-off operations. The TAPER system aims to address both requirements whilst leveraging existing partitioning algorithms. TAPER takes any given initial partitioning as a starting point and iteratively adjusts it by swapping chosen vertices across partitions, heuristically reducing the probability of inter-partition traversals for a given workload of path queries. Iterations are inexpensive thanks to time and space optimisations in the underlying support data structures. We evaluate TAPER on two different large test graphs and over realistic query workloads. Our results indicate that, given a hash-based partitioning, TAPER reduces the number of inter-partition traversals by ∼80%; given an unweighted Metis partitioning, by ∼30%. These reductions are achieved within eight iterations, with the additional advantage of being workload-aware and usable online.
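To make the idea of iterative, workload-aware adjustment concrete, the following is a minimal sketch of a generic greedy refinement loop. It is not TAPER's actual heuristic or data structures, which the abstract does not detail. It assumes that the query workload has already been summarised as a traversal frequency per edge, and the names `cut_weight`, `refine`, `weighted_edges` and `max_size` are hypothetical. For simplicity it moves one vertex at a time under a partition size cap rather than performing paired swaps.

```python
from collections import defaultdict

def cut_weight(weighted_edges, partition):
    """Sum of traversal frequencies over edges that cross partitions."""
    return sum(w for (u, v), w in weighted_edges.items()
               if partition[u] != partition[v])

def refine(weighted_edges, partition, max_size, iterations=8):
    """Greedy refinement (illustrative assumption, not TAPER itself).

    Each pass moves a vertex to the partition holding most of its
    frequency-weighted neighbours, if that strictly lowers the cut
    and the target partition has fewer than max_size vertices.
    """
    partition = dict(partition)  # work on a copy
    nbrs = defaultdict(dict)
    for (u, v), w in weighted_edges.items():
        nbrs[u][v] = nbrs[u].get(v, 0) + w
        nbrs[v][u] = nbrs[v].get(u, 0) + w
    sizes = defaultdict(int)
    for p in partition.values():
        sizes[p] += 1

    for _ in range(iterations):
        moved = False
        for v in sorted(partition):
            # Total traversal frequency pulling v towards each partition.
            pull = defaultdict(float)
            for n, w in nbrs[v].items():
                pull[partition[n]] += w
            home = partition[v]
            best = max(pull, key=pull.get, default=home)
            if best != home and pull[best] > pull[home] and sizes[best] < max_size:
                sizes[home] -= 1
                sizes[best] += 1
                partition[v] = best
                moved = True
        if not moved:
            break  # converged: no improving move remains
    return partition

# Hypothetical workload: edges (1,2) and (3,4) are traversed often.
weighted_edges = {(1, 2): 5, (3, 4): 5, (2, 3): 1}
initial = {1: 0, 2: 1, 3: 0, 4: 1}
refined = refine(weighted_edges, initial, max_size=3)
print(cut_weight(weighted_edges, initial), cut_weight(weighted_edges, refined))
```

Because a vertex only moves when the weight towards its new partition strictly exceeds the weight towards its current one, each move lowers the weighted cut, so the loop cannot oscillate. The default of eight iterations merely echoes the figure quoted in the abstract and is not a tuned parameter.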

Highlights

  • Path queries over labelled graphs are increasingly common in many applications

  • In this paper we present TAPER, a graph re-partitioning system that is sensitive to evolving query workloads

  • We have presented TAPER: a practical system for improving path query processing performance in partitioned graph data

Introduction

Path queries over labelled graphs are increasingly common in many applications, including fraud detection [27], recommender systems [9] and social analysis [2]. Such a labelled graph has the form G = (V, E, L_V, l), where each vertex v is annotated with a label l(v) ∈ L_V from a predefined set L_V of labels (e.g. Purchase, Person, etc.). In this work we address the problem of efficiently and incrementally improving path query performance over k-way partitionings of large, heterogeneous, labelled graphs.
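As an illustration of the definitions above, the sketch below builds a small labelled graph, applies a k-way hash partitioning, and counts how many edge traversals of a label-based path query cross partition boundaries. The toy data, the `Product` label, and the function names `hash_partition` and `count_traversals` are assumptions made for this example only; the query model is a simplification (a fixed sequence of vertex labels matched along directed edges).

```python
from collections import defaultdict

# Hypothetical toy labelled graph G = (V, E, L_V, l).
labels = {1: "Person", 2: "Purchase", 3: "Person", 4: "Purchase", 5: "Product"}
edges = [(1, 2), (3, 4), (2, 5), (4, 5), (1, 3)]

def hash_partition(vertices, k):
    """k-way partitioning that assigns each vertex to (id mod k)."""
    return {v: v % k for v in vertices}

def count_traversals(edges, labels, partition, label_path):
    """Return (total, inter_partition) edge traversals made while
    matching label_path, a sequence of vertex labels."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    total = cut = 0
    # Start from every vertex carrying the first label of the path.
    frontier = [v for v in labels if labels[v] == label_path[0]]
    for next_label in label_path[1:]:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                if labels[v] == next_label:
                    total += 1
                    if partition[u] != partition[v]:
                        cut += 1  # traversal crosses a partition boundary
                    next_frontier.append(v)
        frontier = next_frontier
    return total, cut

query = ["Person", "Purchase", "Product"]
two_way = hash_partition(labels, 2)
print(count_traversals(edges, labels, two_way, query))
```

In this toy case every traversal of the query crosses a partition boundary under the hash partitioning, whereas a single partition (k = 1) would incur none. This is the kind of workload-dependent cost that a workload-sensitive partitioning tries to reduce.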
