HPGraph: A High Parallel Graph Processing System Based on Flash Array

Yuxuan Xing, Fang Liu, Zhengguo Chen, Nong Xiao, Ya Feng, Songping Yu

doi:10.1109/hpcc-smartcity-dss.2016.0077

Abstract

Large-graph analytics has become an important aspect of many big data applications, such as web search, social networks, and recommendation systems. Over the past few years, much research has focused on processing large-scale graphs using distributed systems. A number of studies have instead turned to building graph processing systems on a single server-class machine, in consideration of cost, usability, and maintainability. HPGraph is a highly parallel graph processing system that adopts the edge-centric model. Our contributions are as follows: (1) designing an efficient data allocation and access strategy for NUMA machines, together with task scheduling to maintain load balance; (2) proposing a fine-grained edge-block filtering mechanism to avoid accessing unnecessary edge data; (3) constructing a high-speed flash array as secondary storage. We conducted a detailed evaluation on a 16-core machine using a set of popular real-world and synthetic data sets. The results show that HPGraph always outperforms GridGraph, a state-of-the-art single-machine graph processing system, and can run 1.27X faster than GridGraph for specific applications. Our source code is available at https://github.com/xinghuan1990/HPGraph.

Full Text