Abstract
Disk-based graph systems store part or all of graph data on external devices like hard drives or SSDs, achieving scalability without excessive hardware. However, massive expensive disk I/Os remain the major performance bottleneck of disk-based graph processing. In this paper, we propose Redio, a new approach to accelerating disk-based graph processing by reducing disk I/Os. First, Redio observes that it is feasible to accommodate all vertex states in main memory and this can eliminate almost all vertex-related disk I/Os. Second, Redio introduces a dynamic selective scheduling scheme to identify inactive edges in each iteration and skip them when and only when such skipping can bring performance benefit. To improve its effectiveness, Redioin corporates a compact edge storage to improve data locality and an indexed bitmap to minimize its memory and computation overheads. We have implemented a single-node prototype for Redio under the edge-centric computation model. Extensive experiments show that Redio consistently outperforms well-known edge-centric disk-based systems in all experiments, delivering an average speedup of $4.33\times$ on HDDs and $5.33\times$ on SSDs over the fastest among them (i.e., GridGraph). Experimental results also show that Redio delivers an average speedup of $3.13\times$ on HDDs and $1.28\times$ on SSDs over the fastest among representative vertex-centric disk-based systems (i.e., FlashGraph).
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.