Abstract

We present a new parallel integer multiplier generator for FPGAs. It combines (i) a new Generalized Parallel Counter (GPC) grouping algorithm for column compression with (ii) a LUT based partial product generation, is (iii) unique as it automatically generates placement pragmas, (iv) uses a ternary adder as a final adder to exploit FPGA's internal carry-chains, and (v) employs a novel GPC based row compression, which aims to reduce the width of the final adder. We wrote Verilog generators for our method as well as one leading work in the literature. For synthesis, we wrote a script that can do “binary search” for the optimum latency. Our extensive implementation results on Xilinx Virtex-6 FPGAs show that we almost always produce circuits with smaller latency (i.e., timing) and Area-Timing Product (ATP) compared to the state-of-the-art in the literature, by 18% and 12% (on the average), respectively. We also offer smaller latency compared to the HDL * operator by 9% on the average at a cost of 12% larger ATP on the average. We are worse in latency in 6 cases out of 33, in all of which synthesis maps * to DSP slices. We also include area and energy results on Virtex-6 as well as a limited amount of latency, area, and ATP results on Virtex-5 and Altera Stratix III.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call