A multi-level 2D discrete wavelet transform (DWT) architecture for JPEG2000 is proposed that increases speed by processing multiple tile blocks in parallel. Based on the lifting scheme, a folded architecture and an unfolded architecture, each with a critical path delay of only one multiplier, are designed to increase the throughput rate. Connecting the folded and unfolded architectures through a pipeline ensures a uniform throughput rate across all DWT levels within a single clock domain. Computational resource consumption is reduced by adjusting the timing so that one folded architecture processes three to five levels of DWT for three tile blocks, and a transposing module requiring only six registers is devised to reduce storage resource consumption. The quantization module, which is crucial for code-word control in JPEG2000, is integrated into the scaling module at minimal additional resource cost. Compared with existing architectures, analysis shows that the proposed architecture achieves higher hardware efficiency, with a reduction in transistor-delay product (TDP) of at least 14.69%. Synthesis results further show an area reduction of at least 26.64% and a decrease in area-delay product (ADP) of at least 29.89%. FPGA implementation results indicate a significant decrease in resource utilization.
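For readers unfamiliar with the lifting scheme, the following is a minimal software sketch of a one-level 1D lifting-based 9/7 transform, with the quantization step folded into the final scaling constant. It is an illustration only and not the proposed hardware design: the function names, the border handling, the deadzone quantizer, and the scaling convention (low-pass times 1/K, high-pass times K) are all assumptions, and the constants are the commonly quoted CDF 9/7 lifting values rather than figures taken from this work.

```python
# Illustrative sketch only; names and conventions here are assumptions.
# Commonly quoted CDF 9/7 lifting constants (not taken from the paper).
ALPHA = -1.586134342
BETA = -0.052980118
GAMMA = 0.882911075
DELTA = 0.443506852
K = 1.230174105


def _lift(target, source, coeff, offset):
    """Return target[i] + coeff * (sum of two neighbouring source samples).

    offset = 0 uses source[i] and source[i + 1] (predict step);
    offset = -1 uses source[i - 1] and source[i] (update step).
    Out-of-range indices are clamped to the nearest valid sample.
    """
    n = len(source)
    out = []
    for i, t in enumerate(target):
        left = source[min(max(i + offset, 0), n - 1)]
        right = source[min(max(i + offset + 1, 0), n - 1)]
        out.append(t + coeff * (left + right))
    return out


def _lift_97(x):
    """Four lifting steps on the even (s) and odd (d) samples, unscaled."""
    if len(x) % 2:
        raise ValueError("this sketch assumes an even-length input")
    s, d = list(x[0::2]), list(x[1::2])
    d = _lift(d, s, ALPHA, 0)
    s = _lift(s, d, BETA, -1)
    d = _lift(d, s, GAMMA, 0)
    s = _lift(s, d, DELTA, -1)
    return s, d


def forward_97(x, step=None):
    """One-level forward transform.

    With step=None the outputs are scaled floats. With a quantization step,
    scaling and quantization share one constant per subband, so each sample
    needs a single multiplication followed by truncation toward zero.
    """
    s, d = _lift_97(x)
    if step is None:
        return [v / K for v in s], [v * K for v in d]
    low_gain, high_gain = (1.0 / K) / step, K / step  # folded constants
    return [int(v * low_gain) for v in s], [int(v * high_gain) for v in d]


def inverse_97(low, high):
    """Undo forward_97 (unquantized case) by reversing each lifting step."""
    s = [v * K for v in low]
    d = [v / K for v in high]
    s = _lift(s, d, -DELTA, -1)
    d = _lift(d, s, -GAMMA, 0)
    s = _lift(s, d, -BETA, -1)
    d = _lift(d, s, -ALPHA, 0)
    x = [0.0] * (2 * len(s))
    x[0::2], x[1::2] = s, d
    return x
```

Because each lifting step only adds a function of the other half of the samples, the inverse subtracts the same terms in reverse order, so `inverse_97(*forward_97(x))` recovers `x` up to floating-point rounding regardless of the exact constants. Passing `step` to `forward_97` shows how a quantizer can reuse the scaling multiplier instead of adding a second one.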