Abstract

A parallel algorithm is proposed that uses the classical three-term recursion formula to construct a family of polynomials orthogonal with respect to a discrete inner product. The algorithm requires O(N log N) parallel arithmetic steps on a distributed-memory multiprocessor with N + 1 processors to construct the polynomials p_i(x) for 0 ≤ i ≤ N. If a hypercube topology is assumed, the algorithm can be implemented with an additional overhead of O(N log N) routing steps. In this case the implementation is quite simple, requiring only scalar single-node broadcast and accumulation procedures together with a Gray code mapping. The limited-processor version of the algorithm requires O(N^2/p + N log p) arithmetic steps and O(N log p) routing steps on a hypercube with p ≤ N + 1 nodes. We present some experimental results obtained on an Intel cube.
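
For orientation, the three-term recursion mentioned above can be sketched serially. This is not the parallel algorithm of the paper, only a minimal illustration of the standard monic recurrence p_{k+1}(x) = (x - a_k) p_k(x) - b_k p_{k-1}(x) under a discrete inner product <f, g> = sum_j w_j f(x_j) g(x_j). The function name, argument names, and the choice to represent each polynomial by its values at the nodes are assumptions made for this sketch.

```python
def discrete_orthogonal_polys(nodes, weights, degree):
    """Serial sketch (hypothetical helper, not the paper's parallel method).

    Returns (values, alphas, betas) where values[k][j] = p_k(nodes[j]) for the
    monic polynomials p_0..p_degree that are orthogonal with respect to
    <f, g> = sum_j weights[j] * f(nodes[j]) * g(nodes[j]).
    """
    m = len(nodes)
    if len(weights) != m:
        raise ValueError("nodes and weights must have the same length")
    if degree >= m:
        raise ValueError("degree must be smaller than the number of nodes")

    def inner(f, g):
        # Discrete inner product: one global sum over all nodes.
        return sum(w * a * b for w, a, b in zip(weights, f, g))

    prev = [0.0] * m          # p_{-1} = 0
    curr = [1.0] * m          # p_0 = 1
    values = [curr]
    alphas, betas = [], []
    norm_prev = 1.0           # placeholder, unused while k == 0

    for k in range(degree):
        norm_curr = inner(curr, curr)
        # a_k = <x p_k, p_k> / <p_k, p_k>
        alpha = inner([x * c for x, c in zip(nodes, curr)], curr) / norm_curr
        # b_k = <p_k, p_k> / <p_{k-1}, p_{k-1}>, with b_0 taken as 0
        beta = norm_curr / norm_prev if k > 0 else 0.0
        # Three-term recursion evaluated pointwise at every node.
        nxt = [(x - alpha) * c - beta * p
               for x, c, p in zip(nodes, curr, prev)]
        alphas.append(alpha)
        betas.append(beta)
        prev, curr, norm_prev = curr, nxt, norm_curr
        values.append(curr)

    return values, alphas, betas


if __name__ == "__main__":
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    ws = [1.0] * len(xs)
    vals, a, b = discrete_orthogonal_polys(xs, ws, 3)
    print(a, b)
```

In this sketch each step needs two global sums (the inner products) followed by a pointwise update. On a distributed-memory machine with one node per processor, those sums would plausibly map onto the accumulation procedure and the resulting scalars a_k and b_k onto the broadcast procedure that the abstract mentions, though the paper's actual scheme may differ.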
