FRF: Toward Warp-Scheduler Friendly STT-RAM/SRAM Fine-Grained Hybrid GPGPU Register File Design

Quan Deng, Jun Yang, Zhenyu Zhao, Minxuan Zhang, Youtao Zhang, Shuzheng Zhang

doi:10.1109/tcad.2019.2946808

Abstract

Modern graphics processing units (GPUs) demand register files (RFs) with ever larger capacities and bank sizes, which strains traditional SRAM-based RF designs because of their large die area and long access latency. Recent hybrid RF designs, e.g., those combining SRAM and spin-transfer torque random access memory (STT-RAM), mitigate the issue by exploiting the density advantage of STT-RAM and the performance advantage of SRAM. However, existing hybrid RF designs adopt coarse-grained integration with limited write bandwidth between SRAM and STT-RAM, which restricts the adoption of different warp schedulers at runtime. In this article, we propose FRF, a warp-scheduler friendly, fine-grained hybrid RF design built from SRAM/STT-RAM hybrid cell (HC) structures. By integrating one SRAM cell and $N$ STT-RAM cells into one HC, FRF exploits internal write paths to enlarge the access bandwidth between SRAM and STT-RAM, improving both area and performance. FRF enables concurrent context switching, so that different warp schedulers may be adopted at runtime. FRF also adopts interleaved register mapping (IRM) and on-demand register remapping to further improve the utilization of the SRAM in each HC. Our experimental results show that, on average, FRF achieves a 50% performance improvement and a 40% reduction in energy consumption over the coarse-grained hybrid design when adopting the loose round-robin (LRR) scheduler, and a 159% efficiency improvement over a pure STT-RAM-based RF.
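The hybrid-cell idea in the abstract can be pictured with a small software model. The sketch below is only an illustration under stated assumptions: the class `HybridCell`, the helpers `make_register`, `write_word`, `read_word`, and `context_switch`, and the save-then-load swap policy are all hypothetical names and choices, not the circuit or algorithm from the paper. It shows one SRAM bit paired with $N$ STT-RAM bits per cell, and a context switch in which every cell moves its own bit over a cell-local path, which is the intuition behind bandwidth that scales with the number of cells rather than with a shared write port.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HybridCell:
    """Toy model of one hybrid cell: one SRAM bit plus N STT-RAM bits.

    Hypothetical illustration; not the circuit described in the paper.
    """
    n_stt: int
    sram: int = 0
    stt: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with all STT-RAM slots cleared unless values were supplied.
        if not self.stt:
            self.stt = [0] * self.n_stt

    def swap(self, save_slot: int, load_slot: int) -> None:
        # Assumed policy: save the active SRAM bit into one STT-RAM slot,
        # then load another slot into SRAM, entirely inside the cell.
        self.stt[save_slot] = self.sram
        self.sram = self.stt[load_slot]


def make_register(width: int, n_stt: int) -> List[HybridCell]:
    """Build a register as a row of `width` hybrid cells (bit 0 first)."""
    return [HybridCell(n_stt) for _ in range(width)]


def write_word(cells: List[HybridCell], value: int) -> None:
    """Write an integer into the SRAM bits of the register."""
    for i, cell in enumerate(cells):
        cell.sram = (value >> i) & 1


def read_word(cells: List[HybridCell]) -> int:
    """Read the integer currently held in the SRAM bits."""
    return sum(cell.sram << i for i, cell in enumerate(cells))


def context_switch(cells: List[HybridCell], save_slot: int, load_slot: int) -> None:
    """Swap contexts in every cell; in hardware these would run concurrently."""
    for cell in cells:
        cell.swap(save_slot, load_slot)


reg = make_register(width=8, n_stt=4)
write_word(reg, 0xA5)                             # context 0 is active
context_switch(reg, save_slot=0, load_slot=1)     # park context 0, load empty context 1
write_word(reg, 0x3C)                             # context 1 writes its own value
context_switch(reg, save_slot=1, load_slot=0)     # switch back to context 0
restored = read_word(reg)
```

In this model the loop in `context_switch` stands in for what would be a parallel operation across cells; a coarse-grained design would instead funnel the same bits through a narrower shared path.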


Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Publication Date: Oct 18, 2019
Citations: 28

