Abstract
The performance of signal-processing algorithms implemented in hardware depends on the efficiency of datapath, memory speed and address computation. Pattern of data access in signal-processing applications is complex and it is desirable to execute the innermost loop of a kernel in a single-clock cycle. This necessitates the generation of typically three addresses per clock: two addresses for data sample/coefficient and one for the storage of processed data. Most of the Reconfigurable Processors, designed for multimedia, focus on mapping the multimedia applications written in a high-level language directly on to the reconfigurable fabric, implying the use of same datapath resources for kernel processing and address generation. This results in inconsistent and non-optimal use of finite datapath resources. Presence of a set of dedicated, efficient Address Generator Units (AGUs) helps in better utilisation of the datapath elements by using them only for kernel operations; and will certainly enhance the performance. This article focuses on the design and application-specific integrated circuit implementation of address generators for complex addressing modes required by multimedia signal-processing kernels. A novel algorithm and hardware for AGU is developed for accessing data and coefficients in a bit-reversed order for fast Fourier transform kernel spanning over log 2 N stages, AGUs for zig-zag-ordered data access for entropy coding after Discrete Cosine Transform (DCT), convolution kernels with stored/streaming data, accessing data for motion estimation using the block-matching technique and other conventional addressing modes. When mapped to hardware, they scale linearly in gate complexity with increase in the size.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have