Abstract

We address the problem of deriving systolic arrays in which processor utilization is 100%. We first treat this problem in the context of synthesis from Uniform Recurrence Equations (UREs), and then generalize the result to arbitrary systolic arrays (outside the context of synthesis). We show that in a systolic array it is always possible to merge a parallelepiped of neighboring processors that are active at different clock cycles. The new array is fully efficient, and each of its processors has almost the same cost as a processor of the original array. Such merging corresponds exactly to a transformation by a quasi-linear function. When the original array is derived by integral linear projections of systems of UREs, we give a method to mechanically determine the quasi-linear allocation function that yields the efficient array. The technique can also be extended to any (piece-wise) systolic array, deriving a fully efficient array by “post-processing” it with a (piece-wise) quasi-linear function.
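
The merging described above can be illustrated with a small sketch. It is not taken from the paper: the square domain, the schedule t = i + j, the linear allocation p = i - j, and all function names are assumptions chosen for illustration. Under that schedule each processor p is active only on cycles with the same parity as p, so merging neighbors 2q and 2q+1 through the quasi-linear allocation q = floor((i - j) / 2) should leave no idle cycles between a processor's first and last activation.

```python
from collections import defaultdict

def busy_times(n, allocation):
    """Map each processor to the sorted clock cycles at which it computes.

    Assumes an n x n domain of points (i, j) and the schedule t = i + j.
    """
    table = defaultdict(list)
    for i in range(n):
        for j in range(n):
            table[allocation(i, j)].append(i + j)
    return {p: sorted(ts) for p, ts in table.items()}

def linear_alloc(i, j):
    # Hypothetical linear allocation: project along the direction (1, 1).
    return i - j

def quasi_linear_alloc(i, j):
    # Floor division merges neighboring processors 2q and 2q + 1 into q.
    return (i - j) // 2

def conflict_free(table):
    # No processor is asked to compute two points in the same cycle.
    return all(len(ts) == len(set(ts)) for ts in table.values())

def fully_busy(table):
    # Every processor works on every cycle between its first and last use.
    return all(ts == list(range(ts[0], ts[-1] + 1)) for ts in table.values())

original = busy_times(4, linear_alloc)
merged = busy_times(4, quasi_linear_alloc)
```

In this toy case the merged array uses fewer processors and stays conflict-free, because two points with the same t and the same q must share the parity of p and hence coincide.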
