Abstract
Arithmetic operations such as multiplication and division are used heavily in recent video/image processing and machine learning (ML) applications. The DSP blocks available on FPGAs are high-performance multipliers that can accelerate this computation. However, these blocks are few in number and have fixed placement on the FPGA. Their fixed locations also introduce additional routing delay, which makes them inefficient for low-bit-width multiplications and increases power consumption. FPGA vendors offer optional soft IP cores dedicated to multiplication, which improve performance while using less power, but these cores leave room for improvement. This has led to the development of general-purpose soft-core multiplier designs with low latency, reduced area, and high accuracy that exploit architectural features of FPGAs such as lookup table (LUT) structures and fast carry chains. This work builds accurate and approximate signed and unsigned multipliers for 8-bit configurations. The proposed architecture uses a single LUT5 with multiplexers instead of the dual-LUT5 configuration of the LUT6 architecture. It was designed in Verilog HDL and synthesized in Xilinx software. For the signed multiplier, area and delay decreased by 42.69% and 11.79%, respectively; for the unsigned multiplier, the corresponding reductions were 43.8% and 6.61%.
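To make the terms "accurate" and "approximate" concrete, the following is a minimal behavioral sketch in Python of 8-bit multiplication by summing partial products. It is not the LUT-level architecture proposed in this work: the function names, the `dropped_columns` parameter, and the column-truncation scheme used for the approximate variant are assumptions chosen only for illustration, and the abstract does not state which approximation the authors use.

```python
def mul_unsigned_8(a: int, b: int) -> int:
    """Accurate 8x8 unsigned product as a sum of shifted partial products."""
    assert 0 <= a < 256 and 0 <= b < 256
    product = 0
    for i in range(8):
        if (b >> i) & 1:          # bit i of b selects the row a * 2**i
            product += a << i
    return product


def mul_signed_8(a: int, b: int) -> int:
    """Accurate 8x8 signed product for two's-complement operands.

    Bit 7 of b carries weight -2**7, so its partial product is subtracted.
    """
    assert -128 <= a < 128 and -128 <= b < 128
    b_bits = b & 0xFF             # two's-complement bit pattern of b
    product = 0
    for i in range(8):
        if (b_bits >> i) & 1:
            term = a << i
            product += -term if i == 7 else term
    return product


def mul_unsigned_8_truncated(a: int, b: int, dropped_columns: int = 4) -> int:
    """Approximate unsigned product (illustrative assumption, not the paper's).

    Partial-product bits in the lowest `dropped_columns` columns are
    discarded before summation, trading accuracy for less adder logic.
    """
    assert 0 <= a < 256 and 0 <= b < 256
    keep_mask = ~((1 << dropped_columns) - 1)
    product = 0
    for i in range(8):
        if (b >> i) & 1:
            product += (a << i) & keep_mask
    return product


if __name__ == "__main__":
    print(mul_unsigned_8(200, 100))            # exact product
    print(mul_signed_8(-128, 127))             # exact signed product
    print(mul_unsigned_8_truncated(255, 255))  # never exceeds the exact product
```

In this sketch the truncated variant always underestimates the exact product, and with four dropped columns the error is at most 49. A hardware design would weigh such an error bound against the LUT and carry-chain resources saved.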