Abstract

Hardware-based implementations of the Fast Fourier Transform (FFT) are highly regarded because they outperform software-based sequential solutions. Due to the high number of operations involved in the calculations, most hardware-based FFT approaches completely or partially fold their structure to use resources efficiently. A folding operation requires a permutation block, which is typically implemented using either permutation logic or address generation. Addressing schemes are more resource-efficient than permutation logic. We propose a systematic and scalable procedure for generating permutation-based address patterns for any power-of-2 transform size and any folding factor in FFT cores. To support this procedure, we develop a mathematical formulation based on Kronecker product algebra for address sequence generation and data flow patterns in FFT core computations, a well-defined procedure for scaling address generation schemes, and an improved approach to the overall automated generation of FFT cores. We also analyze the proposed hardware design and compare it with a similar strategy reported in the recent literature in terms of clock latency, performance, and hardware resources. Evaluations were carried out on a Xilinx Virtex-7 Field Programmable Gate Array (FPGA) as the implementation target.

Highlights

  • The Fast Fourier Transform (FFT) is the main block in many electronic communications, signal processing, and scientific computing operations

  • The output change of the Data Address Generator (DAG), as indicated in Equations (17) and (19); the permutations performed by the Data Switch Read (DSR) and Data Switch Write (DSW), as indicated in Equations (27) and (32); the phase factor addresses for the processing elements (PEs), as indicated in Equation (28); and the reading and writing process in the Memory Banks (MBs), as indicated in Equations (25) and (34)

  • This paper proposes a method to generate scalable FFT cores, based on an address generation scheme, for FPGA implementation when the folding factor is scaled

Summary

Introduction

The Fast Fourier Transform (FFT) is the main block in many electronic communications, signal processing, and scientific computing operations. A scalable vertical folding process affects the overall latency associated with memory operations. It creates the need for a permutation block, which is in charge of controlling the data flow between stages. Our address generation scheme satisfies this design requirement without the need for a reordering procedure, and it is independent of pipeline overlapping techniques, such as those reported in works by Richardson et al. [23]. Despite this diverse set of implementations, there are few approaches addressing the issue of using an HCS to allow for scalable folding processing. The contributions of this work include a method to generate scalable FFT cores, based on an address generation scheme, for Field Programmable Gate Array (FPGA) implementation when the vertical folding factor is optimized; a mathematical procedure to automatically generate address patterns in FFT computations; a set of stride group permutations to guide data flow in hardware architectures; and a set of guidelines about space/time tradeoffs for FFT algorithm developers.
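
As an illustration of the stride permutations mentioned above, the following is a minimal Python sketch of how a read-address pattern can be generated. It is not the paper's formulation, whose equations are not reproduced in this summary. It assumes one common convention for the stride-by-S permutation on N points (output position k = i·(N/S) + j reads input address j·S + i), and the function names are hypothetical.

```python
def stride_read_addresses(n, s):
    """Read addresses for a stride-by-s permutation on n points.

    Assumed convention (not taken from the paper): output position
    k = i*(n//s) + j reads input address j*s + i.
    """
    if s <= 0 or n % s != 0:
        raise ValueError("s must be a positive divisor of n")
    # k*s = i*n + j*s, so the remainder is j*s and the quotient is i.
    return [(k * s) % n + (k * s) // n for k in range(n)]


def stride_read_addresses_pow2(m, q):
    """Same pattern for n = 2**m and s = 2**q, written as a bit rotation.

    Each address is the m-bit index k rotated left by q bits.
    """
    if not 0 <= q <= m:
        raise ValueError("q must satisfy 0 <= q <= m")
    n = 1 << m
    mask = n - 1
    return [((k << q) & mask) | (k >> (m - q)) for k in range(n)]


if __name__ == "__main__":
    print(stride_read_addresses(8, 2))
```

For N = 8 and S = 2 the first function yields the addresses 0, 2, 4, 6, 1, 3, 5, 7, that is, even addresses first and odd addresses second. For power-of-2 sizes the same pattern is a rotation of the address bits, which needs no arithmetic at all.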

Definitions
Kronecker Products Formulation of the Scalable FFT with Address Generation
Data Address Generator
Phase Factor Scheduling
FPGA Implementation
Validation
Timing Performance
Resource Consumption
Analysis
Findings
Conclusions