Many VLSI architectures for computing the discrete wavelet transform (DWT) were presented, but the parallel input data sequence and the programmability of the 2-D DWT were rarely mentioned. In this paper, we present a parallel-processing VLSI architecture to compute the programmable 2-D DWT, including various wavelet filter lengths and various wavelet transform levels. The proposed architecture is very regular and easy for extension. To eliminate high frequency components, the pixel values outside the boundary of the image are mirror-extended as the symmetric wavelet transform (SWT) and the mirror-extension is realized via the routing network. Owing to the property of the parallel processing, we adopt the row-based recursive pyramid algorithm (RPA), similar to 1-D RPA, as the data scheduling. This design has been implemented and fabricated in a 0.35 mm 1P4M CMOS technology and the working frequency is 50 MHz. The chip size is about 5200 mm × 2500 mm. For a 256 × 256 image, the chip can perform 30 frames per second with the filter length varying from 2 to 20 and with various levels. The proposed architecture is suitable for real-time applications such as JPEG 2000.
Read full abstract