Abstract
In this paper, we consider tiled chip multiprocessors (CMP) where each tile contains a slice of the total on-chip L2 cache storage and tiles are connected by an on-chip network. The L2 slices can be managed using two basic schemes: 1) each slice is treated as a private L2 cache for the tile; 2) all slices are treated as a single large L2 cache shared by all tiles. Private L2 caches provide the lowest hit latency but reduce the total effective cache capacity, as each tile creates local copies of any line it touches. A shared L2 cache increases the effective cache capacity for shared data, but incurs long hit latencies when L2 data is on a remote tile. We present a new cache management policy, victim replication, which combines the advantages of private and shared schemes. Victim replication is a variant of the shared scheme which attempts to keep copies of local primary cache victims within the local L2 cache slice. Hits to these replicated copies reduce the effective latency of the shared L2 cache, while retaining the benefits of a higher effective capacity for shared data. We evaluate the various schemes using full-system simulation of both single-threaded and multi-threaded benchmarks running on an 8-processor tiled CMP. We show that victim replication reduces the average memory access latency of the shared L2 cache by an average of 16% for multi-threaded benchmarks and 24% for single-threaded benchmarks, providing better overall performance than either private or shared schemes.
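To make the replication decision concrete, the following is a minimal C sketch of what handling a primary (L1) cache victim might look like under victim replication, under assumptions not taken from the paper: the structure names, fields, and the exact priority order for choosing a slot in the local L2 slice are illustrative only, not the authors' implementation.

```c
#include <stdbool.h>

/* Hypothetical descriptors for an L2 line and an L2 set; names are
 * illustrative, not from the paper. */
typedef struct {
    unsigned long tag;
    bool valid;
    bool is_replica;   /* copy of a line whose home is a remote tile */
} l2_line_t;

typedef struct {
    l2_line_t *ways;
    int num_ways;
} l2_set_t;

/* Assumed priority order for choosing where to place a replica in the
 * local L2 slice: prefer an invalid way, then an existing replica, and
 * otherwise decline to replicate (falling back to the plain shared
 * scheme) rather than evict a line homed in this slice. */
static l2_line_t *pick_replication_target(l2_set_t *set)
{
    for (int w = 0; w < set->num_ways; w++)   /* 1. empty way */
        if (!set->ways[w].valid)
            return &set->ways[w];
    for (int w = 0; w < set->num_ways; w++)   /* 2. existing replica */
        if (set->ways[w].is_replica)
            return &set->ways[w];
    return NULL;                              /* 3. do not replicate */
}

/* On an L1 eviction, try to keep a copy of the victim in the local L2
 * slice so a later L1 miss can hit locally instead of on a remote tile. */
void on_l1_eviction(l2_set_t *local_set, unsigned long victim_tag)
{
    l2_line_t *slot = pick_replication_target(local_set);
    if (slot) {
        slot->tag = victim_tag;
        slot->valid = true;
        slot->is_replica = true;
    }
    /* else: the victim is dropped; its home L2 slice still holds it. */
}
```

In this sketch, hits to replicas cut the remote-access latency of the shared scheme, while refusing to displace home lines preserves most of the shared cache's effective capacity.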