Abstract

Register renaming is a performance-critical component of modern, dynamically scheduled processors. Register renaming latency increases with several architectural parameters (e.g., processor issue width, window size, and checkpoint count). Pipelining the register renaming logic can help keep it from restricting the processor clock frequency. This work presents a full-custom, two-stage register renaming implementation in a 130-nm fabrication technology. The latencies of non-pipelined and two-stage pipelined renaming are compared, and the underlying performance and complexity tradeoffs are discussed. The two-stage pipelined design reduces the renaming logic depth from 23 fan-out-of-four (FO4) delays to 9.5 FO4.
