Abstract

A programmable systolic array processor architecture is proposed for the discrete wavelet transform (DWT). The DWT requires a huge amount of data to be filtered, so many processor elements (PEs) are implemented. However, a multiplier for multiply-accumulate operations occupies a large amount of hardware, and complicated connections among PEs lower flexibility and scalability. By using the time-divided multiple-operation method, an execution unit with a simple structure of shifters and a three-input adder achieves the same performance as one built from a multiplier and an adder, at 50% of the hardware size. The unique network mechanism among PEs and the systolic array architecture provide a high level of data transfer, flexibility, and scalability. With this architecture, a processor with ten PEs executes the DWT on a 1024×1024-pixel image in 26.3 ms.
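The abstract does not spell out how the shifter-based execution unit works, so the following is only an illustrative sketch of the general idea of replacing a multiplier with shifters and a three-input adder that is reused over several cycles. It assumes a non-negative integer coefficient and a unit that adds two shifted copies of the input to an accumulator per cycle. The names `three_input_add` and `shift_add_multiply` are hypothetical and do not come from the paper.

```python
def three_input_add(a: int, b: int, c: int) -> int:
    """Model of a three-input adder (hypothetical helper, not from the paper)."""
    return a + b + c


def shift_add_multiply(x: int, coeff: int) -> tuple[int, int]:
    """Multiply x by a non-negative integer coefficient using only shifts
    and a three-input adder, consuming up to two coefficient bits per cycle.

    Returns (product, cycles). This is an assumed scheme for illustration,
    not the paper's actual time-divided multiple-operation method.
    """
    if coeff < 0:
        raise ValueError("this sketch only handles non-negative coefficients")

    # Bit positions of the coefficient that are set to 1.
    shifts = [i for i in range(coeff.bit_length()) if (coeff >> i) & 1]

    acc = 0
    cycles = 0
    # Each cycle feeds the accumulator plus two shifted copies of x
    # into the three-input adder.
    for k in range(0, len(shifts), 2):
        term0 = x << shifts[k]
        term1 = x << shifts[k + 1] if k + 1 < len(shifts) else 0
        acc = three_input_add(acc, term0, term1)
        cycles += 1
    return acc, cycles


if __name__ == "__main__":
    product, cycles = shift_add_multiply(7, 13)  # 13 = 0b1101
    print(product, cycles)
```

In this sketch the number of cycles grows with the number of set bits in the coefficient, which is the kind of time-for-area trade-off a shifter-and-adder unit makes against a full multiplier. How the paper's unit matches multiplier performance is not described in the abstract.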
