LCCG

Jin Zhao, Hai Jin, Xiaofei Liao, Haikun Liu, Bingsheng He, Ligang He, Yu Zhang

doi:10.1145/3458817.3480854

Abstract

In modern data centers, a massive number of concurrent graph processing jobs run on large graphs. However, existing hardware/software solutions suffer from irregular graph traversal and intense resource contention. In this paper, we propose LCCG, a Locality-Centric programmable accelerator that augments the many-core processor to achieve higher throughput for Concurrent Graph processing jobs. Specifically, we integrate a novel topology-aware execution approach into the accelerator design to regularize the graph traversals of multiple jobs on the fly according to the graph topology, which fully consolidates the graph data accesses from concurrent jobs. By reusing the same graph data among more jobs and coalescing the accesses to the vertices' states for these jobs, LCCG can improve the core utilization. We conduct extensive experiments on a simulated 64-core processor. The results show that LCCG improves the throughput of the cutting-edge software system by 11.3 to 23.9 times with only 0.5% additional area cost. Moreover, LCCG achieves speedups of 4.7 to 10.3, 5.5 to 13.2, and 3.8 to 8.4 times over the state-of-the-art hardware graph processing accelerators HATS, Minnow, and PHI, respectively.

Full Text