A study of single-chip processor/cache organizations for large numbers of transistors

Matthew Farrens ,Gary Tyson ,Andrew R Pleszkun

doi:10.1145/191995.192066

Abstract

This paper presents a trace-driven simulation-based study of a wide range of cache configurations and processor counts. This study was undertaken in an attempt to help answer the question of how best to allocate large numbers of transistors, a question that is rapidly increasing in importance as transistor densities continue to climb. At what point does continuing to increase the size of the on-chip first level cache cease to provide sufficient increases in hit rate and become prohibitively difficult to access in a single cycle? In order to compare different configurations, the concept of an Equivalent Cache Transistor is presented. Results indicate that the access time of the first-level data cache is more important than the size. In addition, it appears that once approximately 15 million transistors become available, a two processor configuration is preferable to a single processor with correspondingly larger caches.

Full Text