Abstract

In this paper, we address the issue of feeding future superscalar processor cores with enough instructions. Hardware techniques targeting an increase in the instruction fetch bandwidth have been proposed such as the trace cache microarchitecture. We present a microarchitecture solution based on a register file holding basic blocks of instructions. This solution places the instruction memory hierarchy out of the cycle determining path. We call our approach, instruction register file (IRF). We estimate our approach with a SimpleScalar based simulator run on the Mediabench benchmark suite and compare to the trace cache performance on the same benchmarks. We show that on this benchmark suite, an IRF-based processor fetching up to three basic blocks per cycle outperforms a trace-cache-based processor fetching 16 instructions long traces by 25% on the average.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call