Abstract

Graph neural networks (GNNs) are of great interest in real-life applications such as citation networks and drug discovery, owing to their ability to apply machine-learning techniques to graphs. GNNs use a two-step approach to classify the nodes of a graph into pre-defined categories. The first step uses a combination kernel that performs data-intensive convolution operations with regular memory-access patterns. The second step uses an aggregation kernel that operates on sparse data with irregular access patterns. These mixed data patterns render CPU/GPU-based compute energy-inefficient. Von-Neumann-based accelerators such as AWB-GCN [7] suffer from increased data movement, as the data-intensive combination step requires moving large amounts of data to and from memory to perform computations. ReFLIP [8] performs resistive-RAM-based processing-in-memory (PIM) compute to overcome data-movement costs. However, ReFLIP suffers from an increased area requirement due to its dedicated accelerator arrangement, reduced performance due to limited parallelism, and higher energy consumption due to fundamental issues in ReRAM-based compute. This paper presents a scalable (non-exponential storage requirement), DAC/ADC-less PIM-based combination step with (i) early compute termination and (ii) pre-compute by reconfiguring SoC components. Graph- and sparsity-aware near-memory aggregation using the proposed compute-as-soon-as-ready (CAR) broadcast approach further improves performance and energy. NEM-GNN achieves ∼80-230x, ∼80-300x, ∼850-1134x, and ∼7-8x improvements over ReFLIP in performance, throughput, energy efficiency, and compute density, respectively.
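The two-kernel structure described in the abstract can be sketched as follows. This is a minimal illustrative example, not NEM-GNN's implementation: the function names, the toy graph, and the plain-Python matrix representation are all assumptions, chosen only to contrast the dense, regular-access combination step with the sparse, irregular-access aggregation step.

```python
# Sketch of one GNN layer split into the two kernels the abstract describes.
# Dense combination (regular memory accesses) vs. sparse aggregation
# (graph-dependent, irregular memory accesses). Names/shapes are illustrative.

def combination(X, W):
    """Dense combination kernel: transform every node's feature vector (X @ W)."""
    n, k, m = len(X), len(W), len(W[0])
    return [[sum(X[i][t] * W[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

def aggregation(edges, H):
    """Sparse aggregation kernel: sum transformed features over neighbors."""
    out = [[0.0] * len(H[0]) for _ in H]
    for src, dst in edges:          # irregular, graph-dependent accesses
        for j in range(len(H[0])):
            out[dst][j] += H[src][j]
    return out

# Toy 3-node ring graph: 0 -> 1 -> 2 -> 0
edges = [(0, 1), (1, 2), (2, 0)]
X = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # one-hot node features
W = [[1, 1], [1, 1], [1, 1]]            # toy weight matrix
Z = aggregation(edges, combination(X, W))
print(Z)  # [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
```

The combination step streams dense matrices (amenable to in-memory compute), while the aggregation step's access pattern depends entirely on the edge list, which is why the paper treats the two kernels with different hardware strategies.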

