Abstract

The increasing widths of superscalar processors are placing greater demands upon the fetch mechanism. The trace cache meets these demands by placing logically contiguous instructions in physically contiguous storage. It is capable of supplying multiple fetch blocks each cycle. We examine two fetch and issue techniques, partial matching and inactive issue, that improve the overall performance of the trace cache by improving the effective fetch rate. We show that for the SPECint95 benchmarks partial matching increases the overall performance by 12% and adding inactive issue increases performance by 15%. Furthermore we apply these two techniques to issue blocks from trace segments which contain multiple execution paths. We conclude with a performance comparison between a trace cache implementing partial matching and inactive issue and an aggressive single block fetch mechanism. The trace cache increases performance by an average of 25% over the instruction cache.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call