Distributed-Memory-Based FFT Architecture and FPGA Implementations

J Nash

doi:10.3390/electronics7070116

Abstract

A new class of fast Fourier transform (FFT) architecture, based on the use of distributed memories, is proposed for field-programmable gate arrays (FPGAs). Prominent features are high clock speeds, programmability, reduced look-up-table (LUT) and register usage, simplicity of design, and a capability to do both power-of-two and non-power-of-two FFTs. Higher clock speeds are a consequence of new algorithms and a more fine-grained structure compared to traditional pipelined FFTs, so clock speeds are typically >500 MHz in 65 nm FPGA technology. The programmability derives from the memory-based architecture, which is also scalable. Reduced LUT and register usage arises from a unique methodology to control word growth during computation that achieves high dynamic range, along with inherent systolic circuit characteristics: simple, regular, uniform arrays of processing elements, connected in nearest-neighbor fashion to minimize wiring lengths. The circuit goal was to maximize throughput and minimize the use of the FPGA LUT and register logic fabric. Comparison results from seven different designs, covering a spectrum of functionality (fixed-size, variable, floating-point and variable non-power-of-two FFTs), different FPGA vendors (Intel and Xilinx) and different FPGA types, showed increases in throughput per logic cell up to 181% with an average improvement of 94%.

Highlights

The discrete Fourier transform (DFT) is one of the most prominent signal processing algorithms and is used in a variety of applications within engineering, computer science, physics, and mathematics [1,2]. Since many of these applications are real-time or involve computations on large data sets, special-purpose parallel circuitry, coupled with fast Fourier transform (FFT) algorithms that reduce DFT computation times, is essential.
If the DFT size can be factored into a product of small numbers, the basic idea is for the distributed-memory-based architecture (DMBA) to sequentially perform an appropriate series of transforms of these smaller sizes to produce the DFT output.
Seven different field-programmable gate array (FPGA) FFT implementations are described to demonstrate how the same architecture can be used for a range of applications.
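
The factoring idea in the second highlight can be illustrated in software. The following is a minimal sketch using the textbook Cooley-Tukey decomposition; it is not the paper's base-b algorithm or its hardware mapping. The function names (direct_dft, smallest_factor, factored_dft) are hypothetical and chosen only for this illustration. A length-N transform with N = N1 * N2 is computed as a set of length-N2 transforms, a twiddle multiplication, and a set of length-N1 transforms.

```python
import numpy as np

def direct_dft(x):
    """O(n^2) DFT by matrix-vector product; used for the small factors."""
    n = len(x)
    k = np.arange(n)
    w = np.exp(-2j * np.pi * np.outer(k, k) / n)
    return w @ x

def smallest_factor(n):
    """Smallest factor of n greater than 1 (returns n if n is prime or 1)."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

def factored_dft(x):
    """DFT of any length, built from transforms of its small factors."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    n1 = smallest_factor(n)
    if n1 == n:                      # no further factoring possible
        return direct_dft(x)
    n2 = n // n1
    # Arrange the input as an n1-by-n2 matrix: a[i1, i2] = x[i1 + n1*i2]
    a = x.reshape(n2, n1).T
    # Length-n2 transforms along each row (factored recursively)
    a = np.array([factored_dft(row) for row in a])
    # Twiddle factors exp(-2*pi*j * i1 * k2 / n)
    i1 = np.arange(n1)[:, None]
    k2 = np.arange(n2)[None, :]
    a = a * np.exp(-2j * np.pi * i1 * k2 / n)
    # Length-n1 transforms down each column: b[k1, k2]
    b = np.array([direct_dft(col) for col in a.T]).T
    # Output index is k = n2*k1 + k2, i.e. row-major order of b
    return b.reshape(-1)

# Compare against NumPy's FFT for a non-power-of-two size (60 = 2*2*3*5)
rng = np.random.default_rng(0)
x = rng.standard_normal(60) + 1j * rng.standard_normal(60)
print(np.allclose(factored_dft(x), np.fft.fft(x)))
```

The sketch processes the factors one after another in software; how the DMBA schedules the corresponding column and row transforms on its array of processing elements is described in the full paper, not here.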

Summary

Introduction

SC-FDMA is the part of the LTE protocol [3] used for up-link data transmission. It involves DFT pre-coding of the transmitted signal, where the DFT can be any one of 35 transform sizes from 12 points to 1296 points, with N = 2^a 3^b 5^c and a, b, c non-negative integers. FPGAs are targeted because of their rapidly growing use in communications applications, e.g., base stations and remote radio heads at the top of cell phone towers. We provide results of mapping the DMBA to Xilinx
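
The count of 35 transform sizes can be reproduced with a short enumeration. This is a sketch under two assumptions not stated in the text above: that each size is a multiple of 12 (the number of subcarriers in an LTE resource block), and that the upper bound is 108 resource blocks, inferred from 1296 / 12. The function name lte_dft_sizes and its parameters are hypothetical.

```python
def lte_dft_sizes(max_resource_blocks=108, subcarriers_per_block=12):
    """List sizes N = 12 * m where m has no prime factors other than 2, 3, 5.

    Both default arguments are assumptions made for this illustration.
    """
    sizes = []
    for blocks in range(1, max_resource_blocks + 1):
        m = blocks
        # Strip out every factor of 2, 3 and 5
        for p in (2, 3, 5):
            while m % p == 0:
                m //= p
        # If nothing is left, the block count is of the form 2^a 3^b 5^c
        if m == 1:
            sizes.append(blocks * subcarriers_per_block)
    return sizes

sizes = lte_dft_sizes()
print(len(sizes), sizes[0], sizes[-1])   # 35 12 1296
```

Every size in the list factors as 2^a 3^b 5^c with a >= 2 and b >= 1, since 12 = 2^2 * 3; the exponent c is zero for sizes such as 12 and 1296.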

FPGA Implementations

Related Work

Algorithm

Base-b Algorithm

Matrix–Matrix Systolic Array

Architecture

Column DFTs

Row DFTs

Reachable Transform Sizes

Physical Array

Programmability

Dynamic Range

Method

Floating-Point without FPGA Embedded Hardware Support

Floating Point with Embedded Hardware Support

DMBA Design Approach

On-the-Fly-Twiddle Coefficient Calculation

DMBA LTE SC-FDMA Transform Throughput and Latency

Comparison with Commercial Circuits

Other FPGA LTE Implementations

Conclusions


Journal: Electronics	Publication Date: Jul 17, 2018
Citations: 9	License type: CC BY 4.0


