Abstract

GPUs provide high-bandwidth, low-latency on-chip shared memory and L1 cache to efficiently service a large number of concurrent memory requests to contiguous memory space. To support warp-wide accesses to the L1 cache, GPU L1 cache lines are very wide. However, such an L1 cache architecture cannot always be utilized efficiently when applications generate many memory requests with irregular access patterns, especially due to branch and memory divergence. In this paper, we propose Elastic-Cache, which efficiently supports both fine- and coarse-grained L1 cache-line management for applications with regular and irregular memory access patterns. Specifically, it can store 32- or 64-byte words from non-contiguous memory space in a single 128-byte cache line. Furthermore, it neither requires an extra tag storage structure nor reduces L1 cache capacity, since it stores the auxiliary tags needed for fine-grained L1 cache-line management in shared-memory space that is not fully used by many applications. Our experiments show that Elastic-Cache improves the geometric-mean performance of applications with irregular memory access patterns by 58% without degrading the performance of applications with regular memory access patterns.
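
As a concrete illustration of the access patterns the abstract contrasts, the following minimal CUDA sketch (our own example, not taken from the paper; the kernel names, float element type, and index-array gather are assumptions) shows a regular, coalesced access next to an irregular, index-driven gather of the kind that causes memory divergence and leaves wide 128-byte L1 lines mostly unused.

    // Illustrative kernels only; names and sizes are assumptions, not the paper's.
    #include <cuda_runtime.h>

    // Regular pattern: consecutive threads in a warp read consecutive 4-byte
    // words, so the warp's 32 requests coalesce into a single 128-byte L1 line.
    __global__ void regular_copy(const float* __restrict__ in,
                                 float* __restrict__ out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i];
    }

    // Irregular pattern: each thread gathers through an index array, so the warp
    // may touch up to 32 different 128-byte lines while using only a small
    // portion of each -- the fine-grained case Elastic-Cache targets.
    __global__ void irregular_gather(const float* __restrict__ in,
                                     const int* __restrict__ idx,
                                     float* __restrict__ out, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[idx[i]];
    }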
