Abstract
Recurrent Neural Networks (RNNs) spend most of their execution time performing matrix-vector multiplication (MV-mul). Because the matrices in RNNs exhibit poor reuse and their ever-increasing size exceeds the on-chip storage of mobile/IoT devices, the performance and energy efficiency of MV-mul are determined by those of main-memory DRAM. Therefore, performing MV-mul within DRAM has drawn much attention. However, previous studies did not consider matrix sparsity, the power constraints of DRAM devices, or concurrent DRAM accesses from processors during MV-mul. We propose a main-memory architecture called MViD, which performs MV-mul by placing MAC units inside DRAM banks. For higher computational efficiency, we use a sparse matrix format and exploit quantization. Because of the limited power budget for DRAM devices, we implement the MAC units on only a portion of the DRAM banks. We architect MViD to slow down or pause MV-mul so that memory requests from processors can be processed concurrently while staying within the limited power budget. Our results show that MViD provides 7.2× higher throughput than a baseline system with four DRAM ranks (performing MV-mul in a chip-multiprocessor) while running Deep Speech 2 inference alongside a memory-intensive workload.
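To make the dominant kernel concrete, the sketch below shows a quantized sparse MV-mul in Python. It is illustrative only: the CSR layout, symmetric int8 quantization, and function names are assumptions for this example and are not MViD's actual in-DRAM datapath or the paper's sparse format.

```python
# Illustrative sketch (assumptions: CSR layout, symmetric int8 quantization);
# it shows the MV-mul kernel that dominates RNN inference, not MViD's design.
import numpy as np

def quantize_int8(w):
    """Symmetric int8 quantization: returns int8 values and a scale factor."""
    scale = np.max(np.abs(w)) / 127.0 if np.any(w) else 1.0
    return np.round(w / scale).astype(np.int8), scale

def sparse_mv_mul(values, col_idx, row_ptr, scale, x):
    """MV-mul over a CSR matrix with int8 values; x is the input vector."""
    y = np.zeros(len(row_ptr) - 1, dtype=np.float32)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]   # one MAC operation
        y[row] = acc * scale                   # dequantize the row sum
    return y

# Example: a small sparse weight matrix applied to an activation vector.
W = np.array([[0.0, 0.5, 0.0],
              [0.2, 0.0, 0.0]], dtype=np.float32)
rows, cols = np.nonzero(W)
values, scale = quantize_int8(W[rows, cols])
row_ptr = np.searchsorted(rows, np.arange(W.shape[0] + 1))
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
print(sparse_mv_mul(values, cols, row_ptr, scale, x))  # approximately W @ x
```

In this framing, MViD would execute the inner MAC loop next to the DRAM banks that hold the compressed weights, so only the input and output vectors cross the memory channel.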