Abstract
To efficiently process time-evolving graphs, in which new vertices and edges are inserted over time, distributed time-evolving graph computing systems widely adopt an incremental computing model, which processes the newly constructed graph based on the results of the computation on the outdated graph. In this paper, we first experimentally study how well the results of a graph computation on the local graph structure approximate the results of the same computation on the complete graph structure in distributed environments. We then develop an optimization approach that reduces the response time of bulk synchronous parallel (BSP)-based incremental computing systems by processing time-evolving graphs on the local graph structure instead of on the complete graph structure. We evaluated this approach with the single-source shortest path (SSSP) and PageRank algorithms on the Amazon Elastic Compute Cloud (EC2), a central part of Amazon.com’s cloud-computing platform, using graph datasets of different scales. The experimental results demonstrate that, compared with the existing incremental computing framework of GraphTau, the local approximation approach reduces the response time by 22% for the SSSP algorithm and by 7% for the PageRank algorithm on average.
Highlights
With the rapid increase in the scale of social networks, web graphs, and biological networks, the need for distributed processing of large-scale graphs is intensifying [1,2,3,4,5].
To further improve the performance of the incremental computing model when a tolerable loss in result quality is acceptable and the final exact answers for a time-evolving graph are not required, we present a local approximation approach, LocalAppro, which reduces the communication overhead in bulk synchronous parallel (BSP)-based time-evolving graph computing systems and thereby reduces the response time of processing time-evolving graphs in distributed environments.
We first validate that local approximation improves the overall performance of time-evolving graph computing compared with using the global computing model in incremental computing.
Summary
With the rapid increase in the scale of social networks, web graphs, and biological networks, the need for distributed processing of large-scale graphs is intensifying [1,2,3,4,5]. Instead of submitting a new task that re-executes the underlying algorithm on the newly constructed snapshot, the incremental computing model processes time-evolving graphs by continuing the computation on the newly constructed snapshot from the results of the computation on the previous snapshot. This can guarantee the correctness of the computation, based on the observation that small changes to the previous snapshot often require only small updates to the results. The values of all the vertices of the previous snapshot should be broadcast to their out-neighbors; otherwise, the newly added vertices might not receive the complete set of messages from the vertices of the previous snapshot that they need to update their values.
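The incremental idea described here can be illustrated with a minimal single-machine sketch for SSSP. It is not the GraphTau or LocalAppro implementation: the function names `sssp` and `incremental_sssp`, the adjacency-list layout, and the toy graph are all assumptions made for illustration, and the sketch handles only edge insertions with non-negative weights. It also omits the BSP message passing and the local approximation itself; it only shows how a new snapshot can reuse the distances from the previous one.

```python
import heapq

INF = float("inf")

def _relax_from(graph, dist, heap):
    """Propagate improved distances outward from the vertices in the heap."""
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def sssp(graph, source):
    """Full recomputation: Dijkstra over the whole snapshot."""
    return _relax_from(graph, {source: 0.0}, [(0.0, source)])

def incremental_sssp(graph, prev_dist, new_edges):
    """Insert new_edges into graph and update prev_dist without restarting.

    Only the endpoints whose distance improves are re-activated, so the
    work is proportional to the affected region, not the whole graph.
    """
    dist = dict(prev_dist)
    heap = []
    for u, v, w in new_edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, [])
        # A new edge matters only if its tail is already reachable
        # and it offers a shorter path to its head.
        if u in dist and dist[u] + w < dist.get(v, INF):
            dist[v] = dist[u] + w
            heapq.heappush(heap, (dist[v], v))
    return _relax_from(graph, dist, heap)

# Previous snapshot (hypothetical toy graph) and its exact result.
graph = {"s": [("a", 1.0)], "a": [("b", 5.0)], "b": []}
previous = sssp(graph, "s")

# New snapshot: vertex "c" and two edges are inserted.
updated = incremental_sssp(graph, previous, [("a", "c", 1.0), ("c", "b", 1.0)])
print(updated)
```

In this toy case the incremental update visits only `c` and `b`, yet it yields the same distances as rerunning `sssp` on the updated graph, which is the behavior the incremental model relies on when changes between snapshots are small.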