Abstract

In this paper, we consider the implementation of a product c=A b, where A is N1×N3 band matrix with bandwidth ω and b is a vector of size N 3×1, on bidirectional and unidirectional linear systolic arrays (BLSA and ULSA, respectively). We distinguish the cases when the matrix bandwidth ω is 1≤ω≤N 3 and N3≤ω≤N 1+N3-1. A modification of the systolic array synthesis procedure based on data dependencies and space-time transformations of data dependency graph is proposed. The modification enables obtaining both BLSA and ULSA with an optimal number of processing elements (PEs) regardless of the matrix bandwidth. The execution time of the synthesized arrays has been minimized. We derive explicit formulas for the synthesis of these arrays. The performances of the designed arrays are discussed and compared to the performances of the arrays obtained by the standard design procedure.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call