Abstract

As graph data grows drastically in size and its structures become more diverse and complex, summarizing and querying large attributed graphs becomes increasingly challenging. In this paper, we propose a holistic approach to distributed aggregation-based attributed graph summarization for large-scale approximate attributed graph queries. The approach incorporates node attributes and relationships into the topological structure to generate a semantically understandable graph summary in a bottom-up way. First, we propose a holistic node aggregation strategy that calculates the topological and attributed error increments of merging node pairs. Second, we propose a three-stage distributed implementation framework, in which a novel heuristic measure for efficient parallelization reduces computation and communication costs across multiple machines. Third, we introduce a summary-based approximate graph query approach that accelerates graph queries while maintaining high query accuracy. Finally, extensive experiments were conducted on three real-world and synthetic attributed graphs. The results show that our approach is competitive with the state-of-the-art aggregation-based graph summarization approach in maintaining low error increments and computational costs, and that our summary-based approximate graph query accelerates graph queries while maintaining high query accuracy.
