Abstract

The systolic concept in parallel architecture design, proposed by H. T. Kung [1,2], achieves high throughput and speedup. The linear array for matrix-vector multiplication executes the algorithm in 2n − 1 time steps using 2n − 1 processors. Although the speedup is high, the efficiency is poor (typically about 25% for problem sizes greater than 10). H. T. Kung proposed a linear systolic array using two data streams flowing in opposite directions. However, each processor in this array performs an operation only in every second time step. Many researchers have attempted to improve this design. Nonlinear and folding transformation techniques [3,4,5] only halve the number of processors and do not affect the execution time. We propose the use of a fast linear systolic computation procedure to obtain a solution that uses 3n/2 processors and executes the algorithm in 3n/2 time steps for the same cells, the same communication and the same r...
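The efficiency figures quoted in the abstract can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes that a sequential matrix-vector multiplication takes n² time steps (one multiply-accumulate per step), defines efficiency as speedup divided by the number of processors, and assumes n is even so that 3n/2 is an integer. The function names are hypothetical.

```python
from fractions import Fraction


def efficiency(n, processors, steps):
    """Efficiency = speedup / processors.

    Assumption (not stated in the abstract): the sequential algorithm
    needs n * n steps, one multiply-accumulate per step.
    """
    sequential_steps = n * n
    speedup = Fraction(sequential_steps, steps)
    return speedup / processors


def kung_linear_array(n):
    # Figures from the abstract: 2n - 1 processors, 2n - 1 time steps.
    return efficiency(n, 2 * n - 1, 2 * n - 1)


def proposed_array(n):
    # Figures from the abstract: 3n/2 processors, 3n/2 time steps.
    # Assumes n is even so that 3n/2 is an integer.
    p = 3 * n // 2
    return efficiency(n, p, p)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, float(kung_linear_array(n)), float(proposed_array(n)))
```

Under these assumptions the efficiency of the original array is n²/(2n − 1)², which approaches 1/4 as n grows, in line with the roughly 25% quoted above. The proposed design gives n²/(3n/2)² = 4/9, or about 44%, independent of n.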
