Abstract

Digit-recurrence algorithms are widely used in current microprocessors to compute floating-point division and square root. These iterative algorithms offer a good trade-off between performance, area, and power. We present a floating-point division and square root unit that implements radix-64 floating-point division and radix-16 floating-point square root. To keep the implementation affordable, each radix-64 division iteration and each radix-16 square root iteration is built from simpler radix-4 iterations: three radix-4 iterations for division and two for square root. Speculation is used between consecutive radix-4 iterations to reduce the delay. A digit-recurrence implementation has three parts: initialization, digit iterations, and rounding. The digit iteration is the iterative part and reuses the same logic over several cycles. Division and square root partially share the initialization and rounding stages, whereas each has its own logic for the digit iterations. The result is a low-latency floating-point division and square root unit that requires 11, 6, and 4 cycles for double-, single-, and half-precision division with normalized operands and result, and 15, 8, and 5 cycles for square root. One or two additional cycles are needed in the case of subnormal operand(s) or result.
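
As an illustration of how a radix-64 division step can be composed of three radix-4 steps, the following is a minimal software sketch, not the hardware design described above. It assumes operands already normalized to [1/2, 1), uses exact rational arithmetic, and selects each digit from the set {-2, ..., 2} by exact rounding of 4w/d; a real unit would instead use a selection function on truncated estimates of the residual and divisor, and would overlap the three steps with speculation. The names `select_digit` and `radix64_divide` are hypothetical.

```python
from fractions import Fraction


def select_digit(w, d):
    """Pick a radix-4 quotient digit in {-2, ..., 2}.

    Assumption: exact nearest-integer selection of 4w/d. Hardware would
    use a small table on truncated values instead.
    """
    q = round(4 * w / d)          # Fraction rounds to the nearest int
    return max(-2, min(2, q))     # clamp to the redundant digit set


def radix64_divide(x, d, cycles):
    """Approximate x/d for x, d in [1/2, 1).

    Each 'cycle' runs three radix-4 recurrence steps
        w <- 4*w - digit*d
    and therefore retires 6 quotient bits, i.e. one radix-64 digit.
    Returns (quotient, final_residual).
    """
    x, d = Fraction(x), Fraction(d)
    w = x / 4                     # initial residual, satisfies |w| <= d/2
    q = Fraction(0)               # accumulated digits as an integer
    for _ in range(cycles):
        for _ in range(3):        # three radix-4 steps = one radix-64 step
            digit = select_digit(w, d)
            w = 4 * w - digit * d
            q = 4 * q + digit
    n = 3 * cycles                # number of radix-4 digits produced
    quotient = 4 * q / Fraction(4) ** n
    return quotient, w


if __name__ == "__main__":
    x, d = Fraction(3, 4), Fraction(5, 8)
    quotient, residual = radix64_divide(x, d, cycles=9)
    print(float(quotient), float(x / d))
```

In this sketch the residual stays bounded by d/2 in magnitude after every step, so the returned quotient differs from x/d by at most 2 · 64^(-cycles). The sketch covers only the digit-iteration part; initialization (normalization, exponent handling) and rounding are omitted.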
