Abstract

Energy has become a first-class design constraint for all types of processors. Data accesses contribute to processor energy usage and have been shown to account for up to 25% of the total energy used in embedded processors. A set-associative level-one data cache (L1 DC) organization is particularly energy inefficient because load operations access all L1 DC data arrays in parallel to reduce access latency, even though the data can reside in at most one of the arrays. In this presentation I will describe three techniques we have developed to reduce the energy used for L1 data accesses without adversely affecting performance. The first technique avoids unnecessary loads from the L1 DC data arrays of set-associative caches by speculatively accessing the L1 DC tag arrays earlier in the pipeline and then accessing only the single L1 DC data array where there was a tag match. The second technique detects when a load operation will not cause a delay for a subsequent instruction and sequentially accesses the tag and data memories, again avoiding unnecessary L1 DC data array accesses. The third technique provides a practical data filter cache design that not only significantly reduces data access energy usage, but also avoids the traditional execution-time penalty associated with data filter caches. All of these techniques can be easily integrated into a conventional processor without requiring any ISA changes.
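To make the first technique concrete, the following is a minimal toy model (not taken from the talk; all names, the 4-way configuration, and the access-counting scheme are illustrative assumptions). It contrasts a conventional set-associative load, which reads every way's data array in parallel, with the early-tag-access scheme, which consults the tag arrays first and then reads at most one data array.

```python
# Toy model of L1 DC data-array accesses per load.
# Assumption: a 4-way set-associative cache; "energy" is approximated
# simply by counting how many data arrays are read for one load.

WAYS = 4

def conventional_load(tags, set_index, addr_tag):
    """Conventional load: all data arrays are read in parallel with the
    tag arrays, even though at most one way can hold the data."""
    data_array_reads = WAYS               # speculative reads of every way
    hit = addr_tag in tags[set_index]
    return hit, data_array_reads

def early_tag_load(tags, set_index, addr_tag):
    """Early tag access: the tag arrays are probed earlier in the
    pipeline, so only the matching way's data array is read."""
    hit = addr_tag in tags[set_index]
    data_array_reads = 1 if hit else 0    # at most one data-array read
    return hit, data_array_reads

# One set whose four ways hold tags 0x1A, 0x2B, 0x3C, 0x4D.
tags = [[0x1A, 0x2B, 0x3C, 0x4D]]
hit_c, reads_c = conventional_load(tags, 0, 0x2B)  # 4 data-array reads
hit_e, reads_e = early_tag_load(tags, 0, 0x2B)     # 1 data-array read
```

In this simplified model the early-tag-access load performs one data-array read per hit instead of four, which is the source of the energy savings; the real design must additionally resolve the tag lookup early enough in the pipeline that the single data-array access does not lengthen the load latency.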
