Abstract

Energy efficiency has become a first-order concern in the design of high-performance computing systems, and caches play a key role in improving performance. However, because caches occupy a dominant share of a processor's area, they account for a significant fraction of its total energy consumption. To tackle this issue without sacrificing performance, a non-unified data cache architecture is proposed. The design rests on the observation that a group of frequently used references can be served by a small cache, which does not need to be large to achieve a high hit rate. The non-unified design therefore reduces the number of accesses to the comparatively large L1 data cache, saving energy while sustaining performance. This paper compares the energy consumption of a non-unified data cache design with that of a conventional unified data cache. Since a powerful system requires many cores on a single die, shrinking the L1 caches in each core helps fit more cores on one chip; we therefore conduct experiments across a range of L1 data cache sizes, from small to quite large. The experimental results show that, compared to a conventional L1 data cache, the non-unified design reduces data cache dynamic energy consumption, on average, by up to 22% for a small data cache (such as a 4KB cache) and by up to 82% for a relatively large cache (such as 32KB).
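The intuition behind the savings can be illustrated with a toy dynamic-energy model: every access first probes the small cache, and only misses pay the cost of the larger L1. This is a minimal sketch; the hit rate and per-access energy figures below are illustrative assumptions, not numbers from the paper.

```python
# Toy dynamic-energy comparison: unified L1 vs. non-unified (small cache + L1).
# All parameter values are assumed for illustration only.

def unified_energy(accesses, e_l1):
    """Every access pays the full L1 access energy."""
    return accesses * e_l1

def non_unified_energy(accesses, hit_rate, e_small, e_l1):
    """Every access probes the small cache; only misses also access the L1."""
    return accesses * e_small + accesses * (1.0 - hit_rate) * e_l1

accesses = 1_000_000
e_small, e_l1 = 0.05, 0.40   # assumed per-access energies (nJ): small cache is much cheaper
hit_rate = 0.85              # assumed fraction of references captured by the small cache

base = unified_energy(accesses, e_l1)
filt = non_unified_energy(accesses, hit_rate, e_small, e_l1)
saving = 1.0 - filt / base
print(f"dynamic-energy saving: {saving:.1%}")
```

Under these assumed parameters the model predicts a sizeable saving; the actual 22%-82% range reported in the paper depends on the L1 size, since a larger L1 costs more energy per access and so benefits more from being bypassed.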
