Abstract

In this paper we compare the optimum performance of the fast Fourier transform (FFT) on torus and hypercube multicomputers. The optimal number of floating-point operations is architecturally invariant, so relative performance is determined by communication time. We compare the performance of known multiport communication algorithms that utilize the full bandwidth of the interconnection network on every cycle. Furthermore, data are routed without blocking over minimum-length paths. Specifically, we compare FFT performance on the hypercube and on two-, three-, and four-dimensional torus architectures. On the hypercube we observe the somewhat surprising result that, in the limit, communication time becomes negligible compared to computation time. While the opposite is true of the torus, it is nevertheless possible to obtain comparable performance over a broad range of processor counts and problem sizes. As computers are built with increasing numbers of processors, torus performance can still be made comparable to that of the hypercube by gradually increasing the dimension of the interconnect. Any number of processors can be used to compute the FFT with an efficiency that is theoretically bounded away from zero.
