Abstract

The performance of signal-processing algorithms implemented in hardware depends on the efficiency of datapath, memory speed and address computation. Pattern of data access in signal-processing applications is complex and it is desirable to execute the innermost loop of a kernel in a single-clock cycle. This necessitates the generation of typically three addresses per clock: two addresses for data sample/coefficient and one for the storage of processed data. Most of the Reconfigurable Processors, designed for multimedia, focus on mapping the multimedia applications written in a high-level language directly on to the reconfigurable fabric, implying the use of same datapath resources for kernel processing and address generation. This results in inconsistent and non-optimal use of finite datapath resources. Presence of a set of dedicated, efficient Address Generator Units (AGUs) helps in better utilisation of the datapath elements by using them only for kernel operations; and will certainly enhance the performance. This article focuses on the design and application-specific integrated circuit implementation of address generators for complex addressing modes required by multimedia signal-processing kernels. A novel algorithm and hardware for AGU is developed for accessing data and coefficients in a bit-reversed order for fast Fourier transform kernel spanning over log 2 N stages, AGUs for zig-zag-ordered data access for entropy coding after Discrete Cosine Transform (DCT), convolution kernels with stored/streaming data, accessing data for motion estimation using the block-matching technique and other conventional addressing modes. When mapped to hardware, they scale linearly in gate complexity with increase in the size.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call