Abstract

Given a regular application described by a system of uniform recurrence equations, systolic arrays are commonly derived by means of an affine transformation: an affine schedule determines when the computations are performed, and an affine processor allocation determines where they are performed. When the application must be mapped onto a smaller array, circuit transformations are then applied to the resulting circuit. Because this method proceeds in two steps, it can hardly be optimized globally. We present a different method for designing small arrays: we derive them in one step by means of an affine schedule and a near-affine processor allocation. This lets us generalize the optimization technique for affine mappings to this setting. The method is illustrated on band-matrix multiplication and on convolution algorithms.
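As a rough illustration of the idea, the sketch below maps a one-dimensional convolution onto a fixed number of processors in a single step. It is not the paper's construction. Everything in it is an assumption made for the example: the index domain (i, j) with 0 <= i < n and 0 <= j < k, the dependence vectors (0, 1), (1, 0) and (1, 1), the schedule t = c*i + j, and the reading of "near-affine" as the floor of an affine function, p = floor(j / c), where c = ceil(k / m) is the number of domain columns sharing a processor. The function names are hypothetical.

```python
from itertools import product
from math import ceil


def map_convolution(n, k, m):
    """Map the n-by-k convolution domain onto at most m processors.

    Returns a dict from each domain point (i, j) to a (time, processor) pair.
    """
    c = ceil(k / m)  # cluster size: consecutive j values sharing a processor
    mapping = {}
    for i, j in product(range(n), range(k)):
        t = c * i + j  # affine schedule with vector (c, 1) (assumed)
        p = j // c     # near-affine allocation: floor of an affine function
        mapping[(i, j)] = (t, p)
    return mapping


def is_conflict_free(mapping):
    """True if no processor is given two computations at the same time."""
    slots = list(mapping.values())
    return len(slots) == len(set(slots))


def respects_dependences(mapping, deps=((0, 1), (1, 0), (1, 1))):
    """True if every point is scheduled strictly after the points it uses."""
    for (i, j), (t, _) in mapping.items():
        for di, dj in deps:
            src = (i - di, j - dj)
            if src in mapping and mapping[src][0] >= t:
                return False
    return True


if __name__ == "__main__":
    mapping = map_convolution(n=8, k=6, m=3)
    print(is_conflict_free(mapping), respects_dependences(mapping))
```

Under these assumptions the schedule stays affine while the allocation absorbs the size reduction, so both can be chosen together rather than deriving a full-size array first and transforming the circuit afterwards.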
