Abstract

Near-threshold computing (NTC) has recently emerged as a strong candidate for future energy-efficient computing. However, the adverse impacts of process variation, such as delay and power fluctuations within a die as well as across dies, are much more severe in NTC than in the super-threshold regime. In particular, static random access memory (SRAM)-based components (e.g., cache memories) are easily affected by process variation in NTC, resulting in large delay fluctuations. These fluctuations cause a large loss in the maximum clock frequency of processors, which eventually leads to large yield losses. In this paper, we first analyze L1 data cache yield in NTC and show that frequency binning is inefficient for yield improvement in NTC. We then introduce a variable latency L1 data cache for NTC to obtain a sufficient yield. By allowing more cache access cycles, we can improve cache yield with only a small performance overhead. Moreover, we propose an adaptive line migration technique that improves the performance and energy efficiency of variable latency caches: the cache line that is expected to be frequently accessed in the near future is dynamically migrated to the fastest way in its cache set. According to our evaluation, our cache architecture greatly improves cache yield with only small performance, energy, and area overheads.
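The line migration idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how "expected to be frequently accessed" is predicted, so the sketch assumes a simple per-line hit counter with a hypothetical `threshold`. The names `Way`, `CacheSet`, `fill`, and `access`, the victim choice in `fill`, and the latencies in the usage lines are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Way:
    latency: int                # access cycles of this way, fixed by process variation
    tag: Optional[int] = None   # tag of the line currently stored in this way
    hits: int = 0               # assumed predictor: hit count of the stored line


class CacheSet:
    """One set of a variable latency cache (illustrative sketch only)."""

    def __init__(self, latencies, threshold=2):
        self.ways = [Way(latency=cycles) for cycles in latencies]
        self.threshold = threshold  # hypothetical migration threshold

    def fastest(self):
        return min(self.ways, key=lambda w: w.latency)

    def fill(self, tag):
        # Assumed placement: first empty way, otherwise the least-hit line.
        victim = next((w for w in self.ways if w.tag is None), None)
        if victim is None:
            victim = min(self.ways, key=lambda w: w.hits)
        victim.tag, victim.hits = tag, 0

    def access(self, tag):
        """Return the hit latency in cycles, or None on a miss."""
        for way in self.ways:
            if way.tag == tag:
                cycles = way.latency  # this access still pays the current latency
                way.hits += 1
                fast = self.fastest()
                hot = way.hits >= self.threshold and way.hits > fast.hits
                if way is not fast and hot:
                    # Migrate: swap line contents; the way latencies stay put.
                    way.tag, fast.tag = fast.tag, way.tag
                    way.hits, fast.hits = fast.hits, way.hits
                return cycles
        return None


# Usage with hypothetical per-way latencies (in cycles).
s = CacheSet([12, 3, 18, 7], threshold=2)
s.fill(0xA)                                   # lands in the 12-cycle way
trace = [s.access(0xA) for _ in range(3)]     # migrates after the second hit
```

In this sketch the line pays the slow latency until it is judged hot, after which later hits are served from the fastest way. A real design would also have to account for the cycles and energy spent on the swap itself, which this sketch ignores.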

Highlights

  • Process variations, manufacturing defects and variability in device parameters, have been a major threat to improving the performance, energy, and yield (the ratio of the number of usable chips to the total number of manufactured chips) of processors in the nano-scale design era.

  • We demonstrate the necessity of employing a variable latency cache architecture for near-threshold computing (NTC) and propose an adaptive line migration technique that can be employed along with variable latency caches to improve the yield, performance, and energy efficiency of near-threshold L1 data caches.

  • To model a variable latency cache in the M-SIM simulator, we assign a random latency between 3 and 18 cycles to each cache line in the L1 data cache.
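The per-line latency assignment in the last highlight can be sketched as follows. This is a standalone Python illustration, not M-SIM code. The source gives only the 3 to 18 cycle range, so the uniform distribution, the cache geometry (`num_sets`, `num_ways`), and the function name `assign_line_latencies` are assumptions.

```python
import random

MIN_LATENCY, MAX_LATENCY = 3, 18  # cycle range stated in the text


def assign_line_latencies(num_sets, num_ways, seed=None):
    """Give every cache line a random access latency in cycles.

    Assumption: latencies are drawn uniformly and independently per line.
    """
    rng = random.Random(seed)  # private generator so runs are reproducible
    return [
        [rng.randint(MIN_LATENCY, MAX_LATENCY) for _ in range(num_ways)]
        for _ in range(num_sets)
    ]


# Hypothetical geometry: 128 sets x 4 ways.
latencies = assign_line_latencies(num_sets=128, num_ways=4, seed=0)
fastest_way_per_set = [row.index(min(row)) for row in latencies]
```

Fixing the seed keeps the latency map identical across simulation runs, which makes it easier to compare cache policies on the same modeled die.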

Summary

INTRODUCTION

Process variations, manufacturing defects and variability in device parameters, have been a major threat to improving the performance, energy, and yield (the ratio of the number of usable (or sellable) chips to the total number of manufactured chips) of processors in the nano-scale design era. To achieve a comparable yield under process variations, keeping the access cycle count of the L1 data cache fixed (i.e., increasing the global clock cycle time) may incur severe clock frequency losses, since process variation increases delay much more in NTC than in the super-threshold regime [2]. In this case, an efficient trade-off between the L1 data cache access cycles and the processor clock frequency is the most critical factor for improving the yield and performance of processors. We propose an adaptive line migration scheme that improves the energy efficiency and performance of a variable latency L1 data cache for NTC.
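The trade-off between access cycles and clock frequency can be made concrete with a small Python sketch. It uses a deliberately simplified model that is an assumption, not the paper's methodology: each die is represented by one worst-case cache access delay, and a die counts as usable if that delay fits within the allowed number of cycles at the target clock period. The delay values are made up for illustration and are not measured data.

```python
def cycles_needed(delay_ps, period_ps):
    """Smallest whole number of clock cycles that covers the access delay."""
    return (delay_ps + period_ps - 1) // period_ps  # integer ceiling division


def cache_yield(die_delays_ps, period_ps, max_cycles):
    """Fraction of dies whose cache access fits in max_cycles (simplified model)."""
    usable = sum(1 for d in die_delays_ps if cycles_needed(d, period_ps) <= max_cycles)
    return usable / len(die_delays_ps)


# Hypothetical worst-case L1 access delays for eight dies, in picoseconds.
DELAYS_PS = [2800, 3100, 4500, 2950, 6200, 3000, 3900, 5100]

# Same clock period, different allowed access cycle counts.
yield_fixed = cache_yield(DELAYS_PS, period_ps=1000, max_cycles=3)
yield_relaxed = cache_yield(DELAYS_PS, period_ps=1000, max_cycles=5)
```

Under this model, allowing more access cycles at a fixed clock period admits dies that would otherwise be discarded or force a slower clock, which is the effect a variable latency cache aims to exploit.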

MOTIVATIONAL STUDY
ENERGY
RELATED WORK
CONCLUSION