Abstract

An exact triangle counting algorithm enumerates the triangles in a graph by identifying the common neighbors of the two endpoint vertices of each edge. In this work, we present TriCore, a scalable GPU-based triangle counting system that consists of three major techniques. First, we design a binary search based algorithm that increases both thread parallelism and memory performance on Graphics Processing Units (GPUs), both of which are absent from prior work. Second, in contrast to prior attempts that require multiple graph representations, i.e., compressed sparse row (CSR), edge list, and bitmap, to reside in GPU memory simultaneously, TriCore evenly partitions and distributes the CSR data across all the GPUs and uses a streaming buffer to load the edge list from CPU memory on the fly. This design enables TriCore to process graphs that are orders of magnitude larger than the GPU memory. Third, we develop a dynamic workload management technique to balance the workload across GPUs. Our evaluation demonstrates that TriCore on a single GPU can count the triangles in the billion-edge Twitter graph within 24 seconds, that is, 22× faster than the state-of-the-art CPU project, which uses CPUs that are 8× more expensive. When processing big graphs (up to 33.4 billion edges) that are ∼22× larger than the memory of a single GPU, TriCore achieves a 24× speedup when scaling from 1 to 32 GPUs.
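
As a minimal sketch of the first technique, the CUDA kernel below counts, for each edge (u, v), the common neighbors of u and v by binary searching v's sorted adjacency list in a CSR graph. The names (count_triangles, bsearch_found) and the one-thread-per-edge mapping are our own illustration of the general binary-search approach, not TriCore's actual kernel; the paper's system additionally streams the edge list from CPU memory and balances work across GPUs. The sketch assumes the graph has been oriented (here, by vertex id) so that each triangle is counted exactly once.

    // Sketch: binary-search-based triangle counting on a CSR graph.
    #include <cstdio>
    #include <cuda_runtime.h>

    // Binary search for `key` in the sorted range col_idx[lo, hi); returns 1 if found.
    __device__ int bsearch_found(const int* col_idx, int lo, int hi, int key) {
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            int w = col_idx[mid];
            if (w == key) return 1;
            if (w < key) lo = mid + 1; else hi = mid;
        }
        return 0;
    }

    // One thread per edge (u, v): walk u's adjacency list and binary search
    // each neighbor w in v's sorted adjacency list; a hit closes a triangle.
    __global__ void count_triangles(const int* row_ptr, const int* col_idx,
                                    const int* edge_u, const int* edge_v,
                                    int num_edges, unsigned long long* count) {
        int e = blockIdx.x * blockDim.x + threadIdx.x;
        if (e >= num_edges) return;
        int u = edge_u[e], v = edge_v[e];
        unsigned long long local = 0;
        for (int i = row_ptr[u]; i < row_ptr[u + 1]; ++i)
            local += bsearch_found(col_idx, row_ptr[v], row_ptr[v + 1], col_idx[i]);
        atomicAdd(count, local);
    }

    int main() {
        // Toy graph: triangle 0-1-2, oriented from lower to higher vertex id.
        int h_row_ptr[] = {0, 2, 3, 3};   // CSR row offsets
        int h_col_idx[] = {1, 2, 2};      // sorted neighbor lists
        int h_eu[] = {0, 0, 1}, h_ev[] = {1, 2, 2};
        int m = 3;

        int *row_ptr, *col_idx, *eu, *ev;
        unsigned long long *count, h_count = 0;
        cudaMalloc(&row_ptr, sizeof(h_row_ptr));
        cudaMalloc(&col_idx, sizeof(h_col_idx));
        cudaMalloc(&eu, sizeof(h_eu));
        cudaMalloc(&ev, sizeof(h_ev));
        cudaMalloc(&count, sizeof(h_count));
        cudaMemcpy(row_ptr, h_row_ptr, sizeof(h_row_ptr), cudaMemcpyHostToDevice);
        cudaMemcpy(col_idx, h_col_idx, sizeof(h_col_idx), cudaMemcpyHostToDevice);
        cudaMemcpy(eu, h_eu, sizeof(h_eu), cudaMemcpyHostToDevice);
        cudaMemcpy(ev, h_ev, sizeof(h_ev), cudaMemcpyHostToDevice);
        cudaMemcpy(count, &h_count, sizeof(h_count), cudaMemcpyHostToDevice);

        count_triangles<<<1, 256>>>(row_ptr, col_idx, eu, ev, m, count);
        cudaMemcpy(&h_count, count, sizeof(h_count), cudaMemcpyDeviceToHost);
        printf("triangles: %llu\n", h_count);  // expected: 1
        return 0;
    }

Compared with a merge-based list intersection, binary search keeps every thread on an independent, regular probe pattern, which is the property the abstract credits for the improved thread parallelism and memory performance on GPUs.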
