A system level perspective on branch architecture performance

Brad Calder, Dirk Grunwald, Joel Emer

doi:10.5555/225160.225194

Abstract

Accurate instruction fetch and branch prediction are increasingly important on today's wide-issue architectures. Fetch prediction is the process of determining the next instruction to request from the memory subsystem. Branch prediction is the process of predicting the likely outcome of branch instructions. Many branch and fetch prediction architectures have been proposed, from simple static techniques to more sophisticated hardware designs. Previous studies all compare differing branch prediction architectures in terms of misprediction rates, branch penalties, or an idealized cycles per instruction. This paper provides a system-level performance comparison of several branch architectures using a full pipeline-level architectural simulator. The performance of the various branch architectures is reported using execution time and cycles per instruction. For the programs we measured, our simulations show that having no branch prediction increases the execution time by 27%. By comparison, a highly accurate 512-entry branch target buffer architecture increases execution time by 1.5% relative to an architecture with perfect branch prediction. We also show that the most commonly used branch performance metrics, branch misprediction rates and the branch execution penalty, are highly correlated with program performance and are suitable metrics for architectural studies.
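The metrics named in the abstract can be illustrated with a small first-order model. The sketch below is not taken from the paper: it assumes the branch execution penalty is the average number of penalty cycles per executed branch, split into misfetch and mispredict events, and it folds that penalty into an idealized cycles-per-instruction estimate of the kind the abstract attributes to earlier studies. All function names, parameters, and counts are hypothetical.

```python
def branch_execution_penalty(branches, misfetches, mispredicts,
                             misfetch_cycles, mispredict_cycles):
    """Average penalty cycles per executed branch (assumed definition)."""
    if branches == 0:
        return 0.0
    penalty_cycles = misfetches * misfetch_cycles + mispredicts * mispredict_cycles
    return penalty_cycles / branches


def idealized_cpi(base_cpi, branch_fraction, bep):
    """First-order CPI estimate: base CPI plus branch penalty cycles per instruction.

    This ignores cache misses, pipeline interlocks, and other system-level
    effects, which a full pipeline-level simulation would capture.
    """
    return base_cpi + branch_fraction * bep


# Hypothetical counts for illustration only; these are not results from the paper.
bep = branch_execution_penalty(branches=1000, misfetches=50, mispredicts=100,
                               misfetch_cycles=1, mispredict_cycles=4)
cpi = idealized_cpi(base_cpi=1.0, branch_fraction=0.2, bep=bep)
print(f"BEP = {bep:.2f} cycles/branch, idealized CPI = {cpi:.2f}")
```

A model like this is cheap to evaluate, which is why such metrics are attractive for architectural studies; whether they track measured execution time is the question the paper examines with full simulation.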
