Abstract

Increasing instruction-level parallelism is a basic way to improve the performance of superscalar processors, and true data dependencies are the first issue to address in doing so. The chain technique, which bypasses the execution result of one arithmetic logic unit to others, is an effective method of reducing the effect of true data dependencies without using a high-frequency clock. We have developed an optimal scheduling scheme for the chain technique that uses dependency maps to store data dependency information and utilises it to issue chained instructions effectively. The experimental results show that the chain technique improves performance by about 2% to 25% on CommBench and SPECint2000. The hardware configuration shows that the proposed scheduling can be implemented using only static random access memories and small-scale control logic, with no need for larger-scale hardware.
