Abstract
A reduced-complexity convolutional formulation is presented for systolic implementation of the discrete cosine transform (DCT), in which an N-point transform is computed by four circular-convolution-like operations, each of length approximately N/4. The proposed algorithm not only reduces the computational complexity by a factor of four relative to the conventional formulation, in which an N-point transform is computed via an (N-1)-point cyclic convolution, but also allows concurrent pipelined execution in linear systolic arrays. It is shown that the multiplications in the processing elements can be implemented by lookup tables using dual-port ROM. Two variants of systolic structures using ROM-based multipliers are presented for efficient implementation of the proposed algorithm. Compared with existing structures, the proposed structures are found to offer a significant saving in hardware, lower latency, and higher throughput. Apart from their simplicity and regularity, the proposed structures can also be implemented flexibly using CORDIC circuits or canonical-signed-digit-based multipliers.
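The lookup-table multiplication mentioned above can be illustrated with a minimal sketch. The abstract does not describe the ROM organisation, so the following is only one common arrangement and is an assumption: a single table of products of a fixed coefficient is read twice (as a dual-port ROM would allow in one cycle), once per half of the input word, and the two partial products are combined by shift-and-add. The names `build_rom` and `rom_multiply`, the 8-bit unsigned input, and the coefficient value are hypothetical.

```python
def build_rom(coeff: int, half_bits: int) -> list[int]:
    """Precompute coeff * a for every unsigned address a of width half_bits."""
    return [coeff * a for a in range(1 << half_bits)]


def rom_multiply(rom: list[int], x: int, half_bits: int) -> int:
    """Multiply an unsigned 2*half_bits-wide input x by the fixed coefficient.

    The two table reads below stand in for the two ports of a dual-port ROM
    (assumed arrangement, not taken from the paper).
    """
    mask = (1 << half_bits) - 1
    low = x & mask                   # lower half of the input word
    high = (x >> half_bits) & mask   # upper half of the input word
    # Shift-and-add of the two partial products.
    return (rom[high] << half_bits) + rom[low]


# Hypothetical fixed-point coefficient, e.g. a scaled cosine value.
COEFF = 23
ROM = build_rom(COEFF, half_bits=4)  # 16 entries instead of 256
```

Under these assumptions, `rom_multiply(ROM, 200, 4)` returns the same value as `COEFF * 200`, while the table holds only 16 entries rather than one per 8-bit input.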