Abstract

In this paper, a novel internally folded, hardware-efficient architecture for the multi-level 2-D 9/7 discrete wavelet transform (DWT) is proposed. For multi-level DWT, the unfolded structure is more widely used than the folded structure because of its low memory consumption and low time delay. However, in the unfolded structure a set of input data is valid only once every few clock cycles, which causes a mismatch between clock and data. This mismatch usually has to be resolved with multiple clocks or complex data adjustment, which increases hardware resource consumption and overall system complexity. To solve this problem, we adjust the data input timing by using a single clock domain and folding the DWT architecture of each level to a different degree, according to its own clock-to-data ratio. For an image of N × N pixels and 3-level DWT, the proposed architecture requires only 6N words of temporal memory. For 3-level DWT on an image of 512 × 512 pixels, hardware estimation and comparison with existing architectures show at least a 30.6% decrease in area-delay product (ADP), and a decrease in transistor-delay product (TDP) of at least 22.4% for S = 8 and 25.77% for S = 16.

Highlights

  • The discrete wavelet transform (DWT), as a multi-resolution analysis tool, is commonly used for image analysis, image compression and digital signal processing

  • To achieve the right clock-to-data ratio and deliver the data in the order required by the row filter, a transposing buffer is needed in each 2-D DWT

  • Assuming that the input image is N × N with 8-bit depth, the hardware consumption of the entire 3-level 2-D DWT architecture is listed in Table 1, where 1:1, 2:1 and 4:1 denote the structures with the corresponding clock-to-data ratios, the clock-to-data ratio is the number of clock cycles needed to obtain one valid LL component of data, and S is the parallelism
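The quantities named in the highlights can be tabulated with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the halving of the LL subband side at each level is the standard 2-D DWT decomposition, and the pairing of the quoted ratios 1:1, 2:1 and 4:1 with levels 1, 2 and 3 is an assumption made here for illustration. The 6N-word figure is taken directly from the abstract.

```python
def level_summary(n, levels=3):
    """List (level, LL side length, assumed clock-to-data ratio) for an n x n image.

    Assumption: the quoted ratios 1:1, 2:1, 4:1 belong to levels 1, 2, 3,
    i.e. ratio = 2 ** (level - 1). The text does not state this mapping explicitly.
    """
    rows = []
    for level in range(1, levels + 1):
        ll_side = n >> level          # each level halves both rows and columns
        ratio = 2 ** (level - 1)      # assumed mapping of ratio to level
        rows.append((level, ll_side, ratio))
    return rows


def temporal_memory_words(n):
    """Temporal memory of the 3-level design as stated in the abstract: 6N words."""
    return 6 * n


if __name__ == "__main__":
    for level, side, ratio in level_summary(512):
        print(f"level {level}: LL is {side} x {side}, clock-to-data ratio {ratio}:1")
    print("temporal memory:", temporal_memory_words(512), "words")
```

For the 512 × 512 case used in the comparison, the 6N figure corresponds to 3072 words of temporal memory.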


Summary

Introduction

The discrete wavelet transform (DWT), as a multi-resolution analysis tool, is commonly used for image analysis, image compression and digital signal processing. [12] presented a scalable parallel architecture for multi-level 2-D DWT based on the lifting scheme. The temporal memory of this architecture is reduced to zero in the first level by overlapping seven pixels. [15] used an innovative block-based Z-type memory scanning method of their own to reduce the total processing time, but it is not a multi-level architecture. A survey of the existing 2-D DWT architectures shows that, compared with the folded structure, the unfolded structure has a smaller critical path delay and requires fewer external memory accesses. Based on the lifting scheme, we attempt to develop a high-throughput and hardware-efficient internally folded multi-level 2-D DWT architecture without complex multi-clock processing or complex inter-level data adjustment.
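For readers unfamiliar with the lifting scheme on which the architecture is based, the following is a minimal software sketch of one level of the 1-D 9/7 transform. It is not the proposed hardware design. The lifting coefficients are the commonly published 9/7 values; the function names, the symmetric boundary extension, the restriction to even-length input and the scaling convention (low-pass multiplied by K, high-pass divided by K) are assumptions made for this illustration.

```python
# Commonly published lifting coefficients for the 9/7 wavelet.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling convention assumed for this sketch


def _predict(s, d, c):
    """Odd samples: d[n] += c * (s[n] + s[n+1]), mirroring at the right edge."""
    m = len(d)
    for n in range(m):
        right = s[n + 1] if n + 1 < m else s[m - 1]
        d[n] += c * (s[n] + right)


def _update(s, d, c):
    """Even samples: s[n] += c * (d[n-1] + d[n]), mirroring at the left edge."""
    for n in range(len(s)):
        left = d[n - 1] if n > 0 else d[0]
        s[n] += c * (left + d[n])


def dwt97_forward(x):
    """One level of the 1-D 9/7 lifting transform on an even-length sequence."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("this sketch expects an even-length input")
    s = [float(v) for v in x[0::2]]   # even samples
    d = [float(v) for v in x[1::2]]   # odd samples
    _predict(s, d, ALPHA)
    _update(s, d, BETA)
    _predict(s, d, GAMMA)
    _update(s, d, DELTA)
    low = [v * K for v in s]
    high = [v / K for v in d]
    return low, high


def dwt97_inverse(low, high):
    """Undo dwt97_forward by running the lifting steps backwards with negated coefficients."""
    s = [v / K for v in low]
    d = [v * K for v in high]
    _update(s, d, -DELTA)
    _predict(s, d, -GAMMA)
    _update(s, d, -BETA)
    _predict(s, d, -ALPHA)
    x = [0.0] * (2 * len(s))
    x[0::2] = s
    x[1::2] = d
    return x
```

Applying `dwt97_forward` to every row and then to every column of the result yields one level of the 2-D transform (LL, LH, HL and HH subbands); repeating it on the LL subband gives the next level. Because every lifting step is undone by the same step with its coefficient negated, `dwt97_inverse` recovers the input up to floating-point rounding.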

Lifting Scheme
Data Scanning Method
Unfolded Architecture
Proposed Multi-Level DWT Architecture
Hardware Estimation
Performance Comparison
Findings
Conclusions
