Memory Efficient High Speed Systolic Array Architecture Design with Multiplexed Distributed Arithmetic for 2D DTCWT Computation on FPGA

B Poornima, A Sumathi, Cyril Prasanna Raj Premkumar

doi:10.33180/infmidem2019.301

Abstract

This paper presents a customized Systolic Array Architecture (SAA) for Dual Tree Complex Wavelet Transform (DTCWT) sub-band computation based on a multiplexed Distributed Arithmetic Algorithm (DAA). The proposed architecture is memory efficient and operates at frequencies above 300 MHz when decomposing 256 x 256 input images. Three architectures (a reduced order structure, a multiplexed DA structure and a zero pad structure) are designed and evaluated for DTCWT computation, with the aim of minimizing arithmetic operations and improving latency. The design is modeled in Verilog HDL and implemented on Spartan-6 and Virtex-5 FPGAs using the Xilinx ISE design flow. The latency of the proposed architectures is 15 clock cycles, and the throughput is estimated at 4 outputs every 5 clock cycles. The SAA occupies less than 12% of the FPGA resources and consumes less than 10 mW of power on the FPGA platform.

Highlights

Wavelets play an important role in signal and image processing applications because they support both time and frequency localization
Divakara et al. [25] reported an FPGA implementation of the Dual Tree Complex Wavelet Transform (DTCWT) for image processing applications based on reorder and symmetric structures
We propose three architectures for DTCWT computation that optimize area and timing requirements

Summary

Introduction

Wavelets play an important role in signal and image processing applications because they support both time and frequency localization. Simplified structures for computing the DTCWT are presented in [5,6,7]; they require two real DWT filter bank structures, or two critically sampled DWTs, that process the input data in parallel. The first stage, comprising two filter pairs, processes the input image along the rows to generate the output samples {y1, y2, y3, y4}. For an image of size N x N, row processing with one filter requires 10N² multipliers and 9N² adders. Processing the input data with 12 filters (first and second stages combined) requires a total of 120N² multiplications and 108N² additions. Implementing the DTCWT on an FPGA platform therefore requires optimizing the number of arithmetic operations and memory elements. Some of the most popular methods for DWT implementation that improve speed and optimize area are reviewed, as they provide insight into the improved methods proposed in this work for DTCWT implementation.
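The operation counts quoted in the introduction can be checked with a few lines of arithmetic. The sketch below simply scales the per-filter figures (10 multiplications and 9 additions per pixel) by the number of filters. The function name and its parameters are illustrative and do not come from the paper.

```python
def dtcwt_op_counts(n, filters=12, mults_per_pixel=10, adds_per_pixel=9):
    """Return ((mults, adds) for one filter, (mults, adds) for all filters).

    The per-pixel figures are the ones quoted in the introduction;
    the function itself is a hypothetical helper for checking the totals.
    """
    pixels = n * n  # an N x N image
    per_filter = (mults_per_pixel * pixels, adds_per_pixel * pixels)
    total = (filters * per_filter[0], filters * per_filter[1])
    return per_filter, total


if __name__ == "__main__":
    per_filter, total = dtcwt_op_counts(256)
    print("one filter :", per_filter)
    print("12 filters :", total)
```

For the 256 x 256 images used in the paper, this puts the direct-form cost in the range of several million multiplications per image, which is the motivation for reducing arithmetic operations in the proposed architectures.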

Review of high speed architectures

DTCWT architecture design

Reduced order architecture

Multiplexed DA architecture

Table: LUT contents indexed by address bits A2, A1, A0
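To illustrate how a look-up table addressed by three bits (A2, A1, A0) can replace multipliers, here is a minimal sketch of generic bit-serial distributed arithmetic for a 3-tap inner product. It is not the paper's multiplexed DA structure. The coefficient values, the word length and the function names are assumptions chosen for illustration only.

```python
def build_da_lut(coeffs):
    """Precompute the LUT: entry `addr` holds the sum of the coefficients
    whose address bit is set (bit k of addr selects coeffs[k], so A0 -> coeffs[0])."""
    n = len(coeffs)
    return [
        sum(c for k, c in enumerate(coeffs) if (addr >> k) & 1)
        for addr in range(1 << n)
    ]


def da_inner_product(coeffs, samples, bits=8):
    """Compute sum(c * x) with LUT reads, shifts and adds only.

    `samples` are signed integers that fit in `bits`-bit two's complement.
    """
    lut = build_da_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Form the LUT address from bit b of every input sample.
        addr = 0
        for k, x in enumerate(samples):
            addr |= ((x >> b) & 1) << k
        term = lut[addr] << b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [3, -5, 7]      # hypothetical integer filter taps
    samples = [12, -7, 100]  # hypothetical 8-bit input samples
    print(build_da_lut(coeffs))
    print(da_inner_product(coeffs, samples))
```

With three taps the LUT has eight entries, one per combination of A2 A1 A0, and the result is accumulated one bit-plane per iteration. In hardware this corresponds to one LUT read and one shift-accumulate per clock cycle.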

Zero pad architecture

Systolic array architecture

Comparison of DTCWT architectures

FPGA Implementation

Findings

Conclusion

Journal: Informacije MIDEM - Journal of Microelectronics, Electronic Components and Materials
Publication Date: Dec 9, 2019
License type: cc-by
