Abstract
A growing number of micro-architectures employ pipelined array units for high-speed floating-point applications. To obtain the performance required in such applications, effective loop optimization is crucial. The loop optimization problem for micro-architectures with pipelined processing units is similar to the problem of array processor loop optimization. The URPR method has proven to be an effective, low-complexity approach to optimizing loops in array processor programs, so we conducted a case study of the method applied to a representative pipelined micro-architecture based on the AMD29500 chip family. We present the results of applying two URPR compaction algorithms and a new compaction algorithm to the 29500-based micro-architecture. With the new compaction algorithm, we were able to produce microcode as efficient as the complex microcode manually derived by AMD.