The Instruction Register File

Bernard Goossens

doi:10.1007/978-3-540-45145-7_43

Abstract

We present the Instruction Register File (IRF) coupled with a basic block translator, aiming to deliver a high instruction fetch rate. The IRF has one write port to load instruction cache blocks into registers. It also has p read ports to fetch up to p basic blocks per cycle from up to p registers. The translator predicts up to p on-path basic blocks per cycle and translates their start addresses into IRF references. The references are used in the fetch stage to read the registers, and the basic block limits serve to merge the accessed registers into a dynamically predicted trace line. The IRF coupled with basic block descriptor tables avoids the need to cache traces, as the trace cache micro-architecture does. Moreover, the IRF places the instruction memory hierarchy out of the cycle-determining path, as the data register file does for the data memory hierarchy. The IRF performance is estimated with a SimpleScalar-based simulator run on the Mediabench benchmark suite and compared to the trace cache performance on the same benchmarks. We show that, on this benchmark suite, an IRF-based processor fetching up to 3 basic blocks per cycle outperforms a trace-cache-based processor fetching 16-instruction-long traces by 25% on average.
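
The fetch step described above can be illustrated with a minimal behavioural sketch. It is not the simulator used in the paper: the class and field names (`InstructionRegisterFile`, `BlockRef`, `first`, `last`), the four-instruction register size, and the assumption that each predicted basic block lies within a single register are all hypothetical choices made for illustration. The sketch only shows how up to p references could be read and merged, using the basic block limits, into one trace line.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockRef:
    """Hypothetical IRF reference, as the translator might produce it."""
    reg: int    # index of the IRF register holding the cache block
    first: int  # offset of the basic block's first instruction
    last: int   # offset of the basic block's last instruction (inclusive)


class InstructionRegisterFile:
    """Toy model: one write port (load) and p read ports (fetch)."""

    def __init__(self, num_regs: int, read_ports: int) -> None:
        self.regs: List[List[str]] = [[] for _ in range(num_regs)]
        self.read_ports = read_ports

    def load(self, reg: int, cache_block: List[str]) -> None:
        # Write port: copy one instruction cache block into a register.
        self.regs[reg] = list(cache_block)

    def fetch(self, refs: List[BlockRef]) -> List[str]:
        # Read ports: at most p registers can be read in one cycle.
        if len(refs) > self.read_ports:
            raise ValueError("more references than read ports")
        trace_line: List[str] = []
        for ref in refs:
            # The basic block limits select the useful part of each register.
            trace_line.extend(self.regs[ref.reg][ref.first:ref.last + 1])
        return trace_line


# Assumed example: p = 3 read ports, placeholder instruction names.
irf = InstructionRegisterFile(num_regs=4, read_ports=3)
irf.load(0, ["i0", "i1", "i2", "br_a"])
irf.load(1, ["j0", "j1", "br_b", "j3"])
irf.load(2, ["k0", "k1", "k2", "k3"])

trace = irf.fetch([BlockRef(0, 1, 3), BlockRef(1, 0, 2), BlockRef(2, 2, 3)])
print(trace)
```

In this sketch the trace line is rebuilt from the registers on every fetch, which mirrors the point made above that traces do not need to be cached.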
