Abstract
In this paper, a novel internally folded, hardware-efficient architecture for the multi-level 2-D 9/7 discrete wavelet transform (DWT) is proposed. For multi-level DWT, the unfolded structure is more widely used than the folded structure because of its low memory consumption and low time delay. However, in the unfolded structure a set of input data is valid only once every few clock cycles, which causes a mismatch between clock and data. This mismatch usually has to be resolved with multiple clocks or complex data adjustment, which increases the consumption of hardware resources and the complexity of the overall system. To solve this problem, we adjust the data input timing by using a single clock domain and folding the DWT architecture of each level to a different degree, according to that level's clock-to-data ratio. For an image of N × N pixels and 3-level DWT, the proposed architecture requires only 6N words of temporal memory. For 3-level DWT with an image of 512 × 512 pixels, hardware estimation and comparison with the existing architectures show a decrease of at least 30.6% in area-delay product (ADP), a decrease of at least 22.4% in transistor-delay product (TDP) for S = 8, and a decrease of 25.77% in TDP for S = 16.
Highlights
The discrete wavelet transform (DWT), as a multi-resolution analysis tool, is commonly used for image analysis, image compression and digital signal processing
To achieve the right clock-to-data ratio and deliver data in the order required by the row filter, a transposing buffer is needed in each 2-D DWT
Assuming that the input image is N × N with 8-bit depth, the hardware consumption of the entire 3-level 2-D DWT architecture is listed in Table 1, where 1:1, 2:1 and 4:1 denote the structures with clock-to-data ratios of 1:1, 2:1 and 4:1 respectively; the clock-to-data ratio is the number of clock cycles needed to obtain one valid LL component of data, and S is the parallelism
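As a rough check of the sizes involved, the short Python sketch below works through the arithmetic for the 512 × 512, 3-level case quoted in the abstract. It only assumes the usual dyadic decomposition, in which each level halves the LL subband in both dimensions, and it takes the 6N-word temporal memory figure directly from the abstract. The function names `ll_sizes` and `temporal_memory_words_3level` are hypothetical and do not come from the paper.

```python
def ll_sizes(n, levels=3):
    """Side length of the LL subband after each level of a dyadic 2-D DWT.

    Assumes the side length stays even at every level (true for N = 512).
    """
    sizes = []
    for _ in range(levels):
        if n % 2:
            raise ValueError("side length must be even at every level")
        n //= 2
        sizes.append(n)
    return sizes


def temporal_memory_words_3level(n):
    """Temporal memory in words for the 3-level case, using the 6N figure
    stated in the abstract (the word width is not specified there)."""
    return 6 * n


N = 512
print(ll_sizes(N))                      # [256, 128, 64]
print(temporal_memory_words_3level(N))  # 3072
```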
Summary
The discrete wavelet transform (DWT), as a multi-resolution analysis tool, is commonly used for image analysis, image compression and digital signal processing. [12] presented a scalable parallel architecture for multi-level 2-D DWT based on the lifting scheme. The temporal memory of this architecture is reduced to zero in the first level by overlapping seven pixels. [15] used an innovative block-based Z-type memory scanning method to reduce the total processing time, but it is not a multi-level architecture. From studies of the existing 2-D DWT architectures, it can be observed that, compared with the folded structure, the unfolded structure has a smaller critical path delay and a lower requirement for external memory accesses. Based on the lifting scheme, we attempt to develop a high-throughput and hardware-efficient internally folded multi-level 2-D DWT architecture without complex multi-clock processing or complex inter-level data adjustment.
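For readers unfamiliar with the lifting scheme mentioned above, the following is a minimal software sketch of one level of the 1-D 9/7 transform in its standard lifting factorization (two predict steps, two update steps, then scaling). It is not the hardware architecture proposed in the paper. The function names are hypothetical, and two choices are assumptions: whole-sample symmetric extension at the borders, and the scaling convention in which the low band is multiplied by K and the high band divided by K. A 2-D level would apply the same routine along rows and then along columns.

```python
# Commonly quoted lifting coefficients for the 9/7 wavelet.
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.149604398  # scaling factor (convention assumed, see lead-in)


def _predict(s, d, c):
    # d[i] += c * (s[i] + s[i+1]); the right border is mirrored.
    for i in range(len(d)):
        right = s[i + 1] if i + 1 < len(s) else s[i]
        d[i] += c * (s[i] + right)


def _update(s, d, c):
    # s[i] += c * (d[i-1] + d[i]); the left border is mirrored.
    for i in range(len(s)):
        left = d[i - 1] if i > 0 else d[0]
        s[i] += c * (left + d[i])


def dwt97_forward(x):
    """One level of the 1-D 9/7 DWT on an even-length sequence."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("input length must be even and at least 2")
    s = [float(v) for v in x[0::2]]  # even samples
    d = [float(v) for v in x[1::2]]  # odd samples
    _predict(s, d, ALPHA)
    _update(s, d, BETA)
    _predict(s, d, GAMMA)
    _update(s, d, DELTA)
    low = [v * K for v in s]
    high = [v / K for v in d]
    return low, high


def dwt97_inverse(low, high):
    """Undo dwt97_forward by running the lifting steps in reverse."""
    s = [v / K for v in low]
    d = [v * K for v in high]
    _update(s, d, -DELTA)
    _predict(s, d, -GAMMA)
    _update(s, d, -BETA)
    _predict(s, d, -ALPHA)
    x = [0.0] * (2 * len(s))
    x[0::2] = s
    x[1::2] = d
    return x


low, high = dwt97_forward([3, 7, 1, 8, 2, 9, 4, 6])
restored = dwt97_inverse(low, high)
```

Because each lifting step is inverted by the same step with the opposite sign, `restored` matches the input up to floating-point rounding, and a constant input produces a high band that is numerically close to zero.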