Last level cache layout remapping for heterogeneous systems

Licheng Yu, Tianzhou Chen, Minghui Wu, Xueqing Lou

doi:10.1016/j.sysarc.2018.05.002

Abstract

Heterogeneous systems in which the CPU and GPGPU share the last level cache (LLC) provide viability and flexibility. However, the two processors' different programming models require conflicting memory layouts for best performance. Software conversion that directly accesses the target layout suffers from sub-optimal locality, and conversion in GPGPU shared memory incurs copying and synchronization overhead. In this paper, we analyze the memory layout requirements and propose remapping the memory layout in the shared LLC. A remap controller in the LLC executes a simple program that calculates target requests from an LLC request in the source memory space. The LLC request is thus remapped to the target memory space through the generated requests. Consequently, every processor always accesses memory in its optimal data layout. Locality is preserved through all the private caches, and the software remapping overhead is eliminated. Tiled matrix multiplication is discussed as a case study, and benchmarks from Polybench/GPU and Rodinia are modified to take advantage of the LLC layout remapping. Experimental results show that the average benchmark execution time is reduced to 69%. Compared with CPU software layout conversion, the CPU time is reduced to 41%–73%.
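The address calculation performed by such a remap program can be illustrated with a minimal sketch. This is not the authors' controller program: it assumes a square matrix of 4-byte elements, a 64-byte cache line, a tiled layout as the source view, and a row-major layout as the target, and the names `tiled_to_row_major` and `remap_line` are hypothetical. It shows how one line request in the source space could expand into several line requests in the target space.

```python
# Illustrative sketch only; all sizes and names below are assumptions,
# not values or identifiers taken from the paper.
ELEM = 4    # bytes per element (assumed float32)
LINE = 64   # bytes per cache line (assumed)
N = 64      # matrix is N x N elements (assumed)
T = 4       # tile is T x T elements (assumed)


def tiled_to_row_major(src_addr):
    """Map a byte address in the tiled (source) view to the row-major (target) layout."""
    idx = src_addr // ELEM                      # element index in the tiled view
    tile, within = divmod(idx, T * T)           # which tile, and offset inside it
    tile_row, tile_col = divmod(tile, N // T)   # tile coordinates in the tile grid
    r, c = divmod(within, T)                    # element coordinates inside the tile
    row = tile_row * T + r
    col = tile_col * T + c
    return (row * N + col) * ELEM


def remap_line(src_line_addr):
    """Return the sorted target line addresses holding the data of one source line."""
    targets = set()
    for off in range(0, LINE, ELEM):
        tgt = tiled_to_row_major(src_line_addr + off)
        targets.add(tgt - tgt % LINE)           # align down to the line boundary
    return sorted(targets)


if __name__ == "__main__":
    # One source line covers a full 4 x 4 tile, which spans four target rows.
    print(remap_line(0))
```

Under these assumptions a single source line covers one 4 x 4 tile, so it expands into four target line requests, one per matrix row touched by the tile.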

Similar Papers

Core-Aware Cache Replacement Policy for Reconfigurable Last Level Cache Architecture
Dong-Oh Son ... Jong-Myon Kim
Journal of the Korea Society of Computer and Information | VOL. 18
29 Nov 2013

ZPP: A Dynamic Technique to Eliminate Cache Pollution in NoC based MPSoCs
Dipika Deb ... John Jose
ACM Transactions on Embedded Computing Systems | VOL. 22
09 Sep 2023

LLC Buffer for Arbitrary Data Sharing in Heterogeneous Systems
Yu Licheng ... Chen Tianzhou
01 Dec 2016

Dynamic Program Behavior Identification for High Performance CMPs with Private LLCs
Xiaomin Jia ... Caixia Sun
IEICE Transactions on Information and Systems | VOL. E93-D
01 Jan 2009
