On the Feasibility of Using Reduced-Precision Tensor Core Operations for Graph Analytics

Jesun Sahariar Firoz, Jiajia Li, Ang Li, Kevin Barker

doi:10.1109/hpec43674.2020.9286152

Abstract

Today's data-driven analytics and machine learning workloads are powered largely by General-Purpose Graphics Processing Units (GPGPUs). To accelerate dense matrix multiplication on GPUs, Tensor Core Units (TCUs) have been introduced in recent years. In this paper, we study linear-algebra-based and vertex-centric algorithms for various graph kernels on GPUs, with the objective of applying this new hardware feature to graph applications. We identify the stages in these graph kernels that can potentially be executed on the TCUs. In particular, we leverage the reformulation of the reduction and scan operations in terms of matrix multiplication [1] on the TCUs. We demonstrate that executing these operations, which appear inside different graph kernels, on the TCUs can help establish an end-to-end pipeline on GPGPUs that does not depend on hand-tuned external libraries and still delivers comparable performance for various graph analytics.
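
As an illustration of the reformulation the abstract refers to, the sketch below expresses a reduction and an inclusive scan over one small tile purely as matrix products. It is a minimal sketch under stated assumptions, not the authors' implementation: the names TILE, matmul, to_tile, tile_reduce and tile_scan are hypothetical, a plain Python product stands in for a tensor-core multiply, the tile is 4x4 rather than a hardware fragment size, and integer inputs are used, so no reduced-precision effects are modeled.

```python
from itertools import accumulate

TILE = 4  # hypothetical tile width chosen for readability, not a hardware size


def matmul(a, b):
    """Plain dense product; stands in for one tensor-core matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def to_tile(values):
    """Pack up to TILE*TILE values row-major into a zero-padded TILE x TILE matrix."""
    if len(values) > TILE * TILE:
        raise ValueError("this sketch handles a single tile only")
    padded = list(values) + [0] * (TILE * TILE - len(values))
    return [padded[r * TILE:(r + 1) * TILE] for r in range(TILE)]


# Constant operand matrices used by both operations.
ONES = [[1] * TILE for _ in range(TILE)]
UPPER = [[1 if k <= j else 0 for j in range(TILE)] for k in range(TILE)]
STRICT_LOWER = [[1 if r < i else 0 for r in range(TILE)] for i in range(TILE)]


def tile_reduce(values):
    """Sum of all values: ONES @ A gives column sums, @ ONES adds them up."""
    a = to_tile(values)
    return matmul(matmul(ONES, a), ONES)[0][0]


def tile_scan(values):
    """Inclusive prefix sum over the row-major tile using only matrix products."""
    a = to_tile(values)
    # A @ UPPER: prefix sums within each row.
    within_rows = matmul(a, UPPER)
    # STRICT_LOWER @ A @ ONES: total of all earlier rows, broadcast across each row.
    carry = matmul(matmul(STRICT_LOWER, a), ONES)
    flat = [within_rows[i][j] + carry[i][j] for i in range(TILE) for j in range(TILE)]
    return flat[:len(values)]


data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
total = tile_reduce(data)
prefix = tile_scan(data)
reference = list(accumulate(data))  # sequential scan, for comparison only
```

Here total can be checked against sum(data) and prefix against reference. On real hardware the inputs would typically be stored in a reduced-precision format, so results for non-integer data may differ from a full-precision sequential scan.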

Similar Papers

In-Memory Computing in Emerging Memory Technologies for Machine Learning: An Overview
Kaushik Roy ... Amogh Agrawal
01 Jul 2020

Towards predicting GPGPU performance for concurrent workloads in Multi-GPGPU environment
Sunggon Kim ... Hyeonsang Eom
Cluster Computing | VOL. 23
22 Apr 2020

Recovering single precision accuracy from Tensor Cores while surpassing the FP32 theoretical peak performance
Hiroyuki Ootomo ... Rio Yokota
The International Journal of High Performance Computing Applications | VOL. 36
03 Jun 2022

CUDA usage in electrodynamics and mechatronics
K Mrowca
01 Oct 2011
