The fused multiply-add unit, an important part of many processors' floating-point units, performs a multiplication followed immediately by an addition. For the fused multiply-add unit of the IBM POWER6 microprocessor, a fast 128-bit floating-point end-around-carry (EAC) adder was proposed, but very few algorithmic details about this adder exist in today's literature. In this study, a completely designed EAC adder that can work independently as a regular adder is proposed, and the details of its arithmetic algorithms are described. In IBM's original EAC adder, the Kogge-Stone tree was chosen for its high performance on ASIC technology. The authors present a comparative study of different parallel prefix trees used in the design of the new EAC adder, which targets field-programmable gate array (FPGA) technology. The study highlights the main performance differences among 14 architecture configurations, focusing on area requirements and critical path delay. The experimental results show that one architecture configuration achieves both the lowest area requirement and the highest performance.
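To illustrate the idea of end-around-carry addition over a parallel prefix tree, the following is a minimal bit-level sketch in Python. It is not the IBM design or the authors' architecture: the function names `kogge_stone_prefix` and `eac_add` and the small word width are assumptions made for illustration, and the model only shows how the carry out of the most significant bit can be fed back as the carry into the least significant bit.

```python
def kogge_stone_prefix(g, p):
    """Return group generate/propagate signals for bit ranges [i:0].

    g and p are lists of per-bit generate and propagate bits (LSB first).
    Each level combines position i with position i - d, doubling d each time,
    which is the Kogge-Stone prefix pattern.
    """
    n = len(g)
    G, P = list(g), list(p)
    d = 1
    while d < n:
        G_new, P_new = list(G), list(P)
        for i in range(d, n):
            G_new[i] = G[i] | (P[i] & G[i - d])
            P_new[i] = P[i] & P[i - d]
        G, P = G_new, P_new
        d *= 2
    return G, P


def eac_add(a, b, n):
    """Add two n-bit values with end-around carry (illustrative model)."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    g = [((a >> i) & (b >> i)) & 1 for i in range(n)]  # per-bit generate
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # per-bit propagate
    G, P = kogge_stone_prefix(g, p)
    carry_around = G[n - 1]  # carry out of the MSB, fed back into the LSB
    result = 0
    for i in range(n):
        if i == 0:
            carry_in = carry_around
        else:
            # A carry reaches bit i if it is generated below bit i, or if the
            # end-around carry propagates through all lower bits.
            carry_in = G[i - 1] | (P[i - 1] & carry_around)
        result |= (p[i] ^ carry_in) << i
    return result


def eac_reference(a, b, n):
    """Reference model: add, then add the carry out back into the sum."""
    mask = (1 << n) - 1
    s = (a & mask) + (b & mask)
    if s >> n:
        s = (s & mask) + 1
    return s & mask
```

In this model, adding the bitwise complement of `b` to `a` yields `a - b` whenever `a > b`; for example, `eac_add(9, ~3 & 0xF, 4)` gives 6. The Kogge-Stone pattern could be swapped for another prefix tree by changing only `kogge_stone_prefix`.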