Software/Hardware Co-design of 3D NoC-based GPU Architectures for Accelerated Graph Computations

Dwaipayan Choudhury, Aravind Sukumaran Rajam, Partha Pratim Pande, Reet Barik, Ananth Kalyanaraman

doi:10.1145/3514354

Abstract

Manycore GPU architectures have become the mainstay for accelerating graph computations. One of the primary bottlenecks to the performance of graph computations on manycore architectures is data movement. Since most memory accesses in graph processing come from vertex neighborhood lookups, locality in the graph data structures largely determines how much data must move. Vertex reordering is a widely used technique for improving this locality. However, reordering alone is not sufficient: it must be complemented by efficient task allocation on the manycore GPU architecture to reduce the latency caused by local cache misses. In this article, we therefore introduce a software/hardware co-design framework for accelerating graph computations. Our approach couples an architecture-aware vertex reordering with a priority-based task allocation technique. Because the task allocation aims to reduce on-chip latency and the associated energy, the choice of Network-on-Chip (NoC) as the communication backbone of the manycore platform is an important design parameter. By leveraging emerging three-dimensional (3D) integration technology, we propose the design of a small-world NoC (SWNoC)-enabled manycore GPU architecture, in which the placement of the links connecting the streaming multiprocessors (SMs) and the memory controllers (MCs) follows a power-law distribution. Depending on the dataset and the graph application, the proposed 3D SWNoC-enabled software/hardware co-design framework achieves 11.1% to 22.9% better performance and 16.4% to 32.6% lower energy consumption than the default ordering of the dataset running on a conventional planar mesh architecture.
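To make the idea of vertex reordering concrete, the following is a minimal sketch in Python. It is not the architecture-aware reordering proposed in the article, whose details the abstract does not give. It swaps in a plain breadth-first relabeling as a stand-in, and it uses the mean gap between the ids of adjacent vertices as a rough proxy for locality. The function names (bfs_reorder, mean_edge_gap) and the toy graph are hypothetical and chosen only for illustration.

```python
from collections import deque


def bfs_reorder(adj):
    """Map each old vertex id to a new id, assigned consecutively in BFS order.

    Assumption: a generic BFS relabeling, used here only as a stand-in for
    the article's architecture-aware reordering.
    """
    new_id = {}
    for start in sorted(adj):          # cover every connected component
        if start in new_id:
            continue
        new_id[start] = len(new_id)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in sorted(adj[u]):
                if v not in new_id:
                    new_id[v] = len(new_id)
                    queue.append(v)
    return new_id


def mean_edge_gap(adj, label=None):
    """Average |label(u) - label(v)| over undirected edges.

    A smaller value means neighbors sit closer together in memory,
    which is a crude proxy for data locality.
    """
    label = label or {v: v for v in adj}
    gaps = [abs(label[u] - label[v]) for u in adj for v in adj[u] if u < v]
    return sum(gaps) / len(gaps)


# Hypothetical toy graph: the path 0-4-1-5-2-6-3, stored with interleaved ids.
adj = {0: [4], 4: [0, 1], 1: [4, 5], 5: [1, 2], 2: [5, 6], 6: [2, 3], 3: [6]}

order = bfs_reorder(adj)
print(mean_edge_gap(adj))         # 3.5 with the original ids
print(mean_edge_gap(adj, order))  # 1.0 after BFS relabeling
```

On this toy path graph the relabeling places every vertex next to its neighbors, so the mean gap drops from 3.5 to 1.0. Real graphs with skewed degree distributions will not reorder this cleanly, which is one reason a reordering may need to be paired with task allocation, as the abstract argues.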

Journal: ACM Transactions on Design Automation of Electronic Systems
Publication Date: Jun 27, 2022
Citations: 5
