Abstract

The aim of this paper is to show that the recently developed high-performance divide-and-conquer algorithm for solving linear recurrence systems with constant coefficients, together with the new BLAS-based algorithm for narrow-banded triangular Toeplitz matrix–vector multiplication, allows linear recursive filters to be evaluated efficiently on distributed-memory parallel computers. We apply the BSP model of parallel computing to predict the behaviour of the algorithm and to find the optimal values of the method's parameters. The results of experiments performed on a cluster of twelve dual-processor Itanium 2 computers and on a Cray X1 are also presented and discussed. The algorithm utilizes up to 30% of the peak performance of 24 Itanium processors, whereas a simple scalar algorithm utilizes only about 4% of the peak performance of a single processor. Copyright © 2006 John Wiley & Sons, Ltd.
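
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general divide-and-conquer idea for a linear recurrence with constant coefficients, assuming the form y[k] = f[k] + a[0]*y[k-1] + ... + a[m-1]*y[k-m] with zero history before k = 0. All function names and the block size `s` are hypothetical and are not taken from the paper. The BLAS-based Toeplitz product and the distributed-memory communication are not reproduced: the per-block solves, which are independent and could run in parallel, are executed one after another here.

```python
def recurrence_sequential(a, f):
    """Scalar reference: y[k] = f[k] + sum_j a[j-1] * y[k-j], zero history."""
    m = len(a)
    y = []
    for k, fk in enumerate(f):
        acc = fk
        for j in range(1, m + 1):
            if k - j >= 0:
                acc += a[j - 1] * y[k - j]
        y.append(acc)
    return y


def homogeneous_responses(a, s):
    """For j = 1..m, the length-s solution of the homogeneous recurrence
    whose only nonzero history value is w[-j] = 1. Because the coefficients
    are constant, the same responses serve every block."""
    m = len(a)
    responses = []
    for j in range(1, m + 1):
        w = [0.0] * (m + s)          # w[m + k] stores the value at index k
        w[m - j] = 1.0               # unit history value at index -j
        for k in range(s):
            w[m + k] = sum(a[i - 1] * w[m + k - i] for i in range(1, m + 1))
        responses.append(w[m:])
    return responses


def recurrence_blocked(a, f, s):
    """Divide-and-conquer evaluation with blocks of length s (assumed >= 1)."""
    m = len(a)
    blocks = [f[p:p + s] for p in range(0, len(f), s)]
    # Step 1: solve every block with zero history (independent work).
    local = [recurrence_sequential(a, b) for b in blocks]
    # Step 2: correct each block using the last m values before it.
    h = homogeneous_responses(a, s)
    y = []
    for z in local:
        hist = [y[-j] if len(y) >= j else 0.0 for j in range(1, m + 1)]
        corrected = [
            z[k] + sum(hist[j] * h[j][k] for j in range(m))
            for k in range(len(z))
        ]
        y.extend(corrected)
    return y
```

By linearity, each block's true solution is its zero-history solution plus a combination of the homogeneous responses weighted by the preceding m output values, so only the short correction step has to proceed block by block.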
