Abstract

This paper discusses an implementation of the two-dimensional fast Fourier transform (FFT) on SPRINT, the Systolic Processor with a Reconfigurable Interconnection Network of Transputers. SPRINT is a 64-element multiprocessor developed at Lawrence Livermore National Laboratory for the experimental evaluation of systolic algorithms and architectures. The implementation is a radix-two decimation-in-time algorithm, valid for an arbitrarily sized p x q mesh of processors and an arbitrarily sized P x Q complex input array (P, Q, p, and q must all be powers of two). The processors are interconnected with their nearest neighbors along North-South-East-West communication links. The problems of array partitioning, bit reversal, sub-array transform computation, and weighted (butterfly) combinations are all discussed. Finally, benchmark results are presented, and speedup and efficiency are discussed.

1. INTRODUCTION

The FFT is probably the single most important fundamental tool in signal and image processing today. However, it is very compute intensive, often requiring many minutes of computation time to transform large two-dimensional arrays. Researchers are constantly looking for more efficient ways to compute the FFT to improve throughput in signal and image processing applications. In recent years, interest in parallel processing has grown rapidly, and comparatively low-cost parallel processors with computational speeds rivaling those of supercomputers are now available to the researcher. It is natural, then, that researchers in signal and image processing should turn to parallel processing platforms in an effort to achieve this increased throughput. Over the past several years, we have developed the Systolic Processor with a Reconfigurable Interconnection Network of Transputers (SPRINT), a 64-processor multiprocessor, to support research in systolic algorithms and architectures and parallel processing in general.
Given the ever-increasing need to compute larger and larger two-dimensional FFTs in image processing applications, and the availability of the SPRINT multiprocessor, we have developed a multiprocessor implementation of the two-dimensional FFT. This paper describes the implementation on SPRINT, the problems we encountered, and the design choices we were required to make.

The paper is organized as follows. First, we briefly discuss the architecture of SPRINT and the FFT algorithm we chose to implement. Next, the implementation is discussed in detail, with emphasis on array partitioning, bit reversal, sub-array transform computation, and weighted (butterfly) combinations. Finally, benchmark results are presented, and speedup and efficiency are discussed.
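
As a point of reference for the terms used above, the following is a minimal single-processor sketch of a radix-two decimation-in-time FFT in Python. It is not the SPRINT implementation: the function names are hypothetical, the two-dimensional transform is assumed to use a plain row-column decomposition, and the partitioning of the array across the p x q mesh is not modeled. It only illustrates the bit-reversal step and the weighted (butterfly) combinations on power-of-two lengths.

```python
import cmath


def bit_reverse_permute(values):
    """Return a copy of `values` reordered by bit-reversed index."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    out = [0j] * n
    for i, v in enumerate(values):
        rev = 0
        for b in range(bits):
            if i & (1 << b):
                rev |= 1 << (bits - 1 - b)
        out[rev] = v
    return out


def fft_radix2_dit(values):
    """Iterative radix-two decimation-in-time FFT (power-of-two length)."""
    a = bit_reverse_permute([complex(v) for v in values])
    n = len(a)
    span = 2
    while span <= n:
        half = span // 2
        w_step = cmath.exp(-2j * cmath.pi / span)  # twiddle increment for this stage
        for start in range(0, n, span):
            w = 1 + 0j
            for k in range(half):
                # Butterfly: combine two half-size transforms with weight w.
                top = a[start + k]
                bottom = w * a[start + k + half]
                a[start + k] = top + bottom
                a[start + k + half] = top - bottom
                w *= w_step
        span *= 2
    return a


def fft2_radix2(array):
    """2D FFT by row-column decomposition (an assumption of this sketch)."""
    rows = [fft_radix2_dit(row) for row in array]
    cols = [fft_radix2_dit(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]
```

For example, `fft2_radix2([[1, 0, 0, 0], [0, 0, 0, 0]])` transforms a 2 x 4 unit impulse, whose spectrum is constant across all eight output samples.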

