Understanding how off-chip memory bandwidth partitioning in Chip Multiprocessors affects system performance

Fang Liu, Yan Solihin, Xiaowei Jiang

doi:10.1109/hpca.2010.5416655

Abstract

Chip Multi-Processor (CMP) architectures have recently become a mainstream computing platform. Recent CMPs allow cores to share expensive resources, such as the last level cache and off-chip pin bandwidth. To improve system performance and reduce the performance volatility of individual threads, last level cache and off-chip bandwidth partitioning schemes have been proposed. While the effect of cache partitioning on system performance is well understood, little is understood about how bandwidth partitioning affects system performance, and how bandwidth and cache partitioning interact with one another. In this paper, we propose a simple yet powerful analytical model that enables us to answer several important questions: (1) How does off-chip bandwidth partitioning improve system performance? (2) In what situations is the performance improvement high or low, and what factors determine that? (3) In what way do cache and bandwidth partitioning interact, and is the interaction negative or positive? (4) Can a theoretically optimum bandwidth partition be derived, and if so, what factors affect it? We believe that understanding the answers to these questions is very valuable to CMP system designers in devising strategies to deal with the scarcity of off-chip bandwidth in future CMPs with many cores on a chip.

Full Text