Abstract

Most digital signal processing (DSP) algorithms for multimedia and communication applications require multiplication and addition operations. In particular, the matrix-matrix and matrix-vector multiplications frequently used in DSP algorithms rely on inner product arithmetic, which takes most of the processing time. The multiplications in these algorithms also have different input bitwidths. Therefore, the multiplications for the inner product need to be sufficiently flexible in terms of bitwidth to use the multiplication resources efficiently. This paper proposes a novel reconfigurable inner product architecture that uses a pipelined adder array and offers increased flexibility in the bitwidths of the input arrays. The proposed architecture consists of sixteen 4x4 multipliers and a pipelined adder array, and it can compute the inner product of input arrays whose bitwidths are any combination of multiples of 4, such as 4x4, 4x8, 4x12, ..., 16x16. Experimental results show that the proposed architecture has a latency of at most 9 clock cycles and a throughput of one result per clock cycle for inner products of input arrays with various bitwidths. When TSMC 0.18 µm libraries are used, the chip area and power dissipation of the proposed architecture are 332,162 (in NAND gates) and 3.46 mW, respectively. The proposed architecture can be applied to a reconfigurable arithmetic engine for real-time DSP applications.
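The bitwidth flexibility described above rests on a standard identity: a product of two operands whose widths are multiples of 4 can be assembled from 4x4 partial products that are shifted and summed. The following Python sketch is a behavioral illustration of that identity only, not the architecture proposed in the paper. It assumes unsigned operands, and the names nibbles, mul4x4, multiply, inner_product, and multipliers_per_product are hypothetical helpers introduced for this example. The abstract does not state how signed operands are handled or how the sixteen multipliers are scheduled, so neither is modeled here.

```python
def nibbles(value, bitwidth):
    """Split an unsigned integer into 4-bit digits, least significant first."""
    if bitwidth % 4 != 0 or not 4 <= bitwidth <= 16:
        raise ValueError("bitwidth must be 4, 8, 12, or 16")
    if not 0 <= value < (1 << bitwidth):
        raise ValueError("value does not fit in the given bitwidth")
    return [(value >> (4 * i)) & 0xF for i in range(bitwidth // 4)]


def mul4x4(a, b):
    """Model of a single 4x4 unsigned multiplier (result fits in 8 bits)."""
    return a * b


def multiply(a, b, a_bits, b_bits):
    """Build an a_bits x b_bits product from shifted 4x4 partial products."""
    total = 0
    for i, a_digit in enumerate(nibbles(a, a_bits)):
        for j, b_digit in enumerate(nibbles(b, b_bits)):
            # The partial product of digit i and digit j has weight 16**(i + j).
            total += mul4x4(a_digit, b_digit) << (4 * (i + j))
    return total


def inner_product(xs, ys, x_bits, y_bits):
    """Inner product of two equal-length arrays of unsigned integers."""
    if len(xs) != len(ys):
        raise ValueError("input arrays must have the same length")
    return sum(multiply(x, y, x_bits, y_bits) for x, y in zip(xs, ys))


def multipliers_per_product(a_bits, b_bits):
    """Number of 4x4 partial products needed for one a_bits x b_bits product."""
    return (a_bits // 4) * (b_bits // 4)
```

Under these assumptions, one 16x16 product needs sixteen 4x4 partial products, while one 4x8 product needs two. This count is one way to see why a fixed pool of sixteen 4x4 multipliers can be shared differently depending on the operand widths.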
