Abstract

In order to decrease latency and energy consumption, processors use hierarchical memory systems to store temporally and spatially related instructions close to the core. Instruction register file (IRF) is an energy-efficient solution for the lowest level in the instruction memory hierarcy. Being compiler-controlled, it removes the area and energy overheads involved in cache tag checking and adds flexibility in the separation of the instruction fetch and execution. In this paper, we systematically evaluate for the first time the effect of three IRF design variations on energy and delay against an unoptimized baseline IRF. Having instruction fetch and decode with IRF in the same pipeline stage allows minimal delay branching, but results in low operating clock frequency and impaired energy delay product compared to splitting them into two stages. Assuring instruction presence in IRF before execution with software reduces the area and increases maximum clock frequency compared to assurance with hardware, but requires compiler analysis. With a proposed compiler-analyzed instruction placement and co-designed hardware implementation, energy consumption with the best IRF variant is reduced by 9% on average with EEMBC Coremark and CHStone benchmaks. The energy delay product is improved by 23% when compared to the baseline IRF approach.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call