Abstract

The Sweeney, Robertson and Tocher (SRT) algorithm is a common and efficient method for division and square root (div/sqrt). We propose overlapping two iterations in one cycle by predicting the remainder and the quotient. To reduce latency, we use a redundant representation together with a minimum redundancy factor. Division and square root can be integrated into one unit, which reduces hardware cost. With a 40 nm technology library, the post-layout area of our architecture is 37795 µm², the power is 81.19 mW, and the delay is only 656 ps. Double-precision division and square root take 17 and 16 cycles, respectively. Experiments show that our architecture achieves low latency and high frequency with modest area and power.
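For readers unfamiliar with SRT, the digit recurrence can be illustrated with a minimal software sketch. It is a textbook radix-2 version with the redundant digit set {-1, 0, 1}, not the architecture described in this abstract: it does not overlap iterations, uses exact comparisons instead of a short estimate of a redundant remainder, and covers division only. The function name `srt_radix2_divide` and its parameters are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def srt_radix2_divide(x, d, n):
    """Illustrative radix-2 SRT division (assumed textbook form).

    Requires a normalized divisor 1/2 <= d < 1 and |x| <= d.
    Returns (q, w, digits) with x == q * d + w / 2**n exactly.
    """
    x, d = Fraction(x), Fraction(d)
    half = Fraction(1, 2)
    if not (half <= d < 1) or abs(x) > d:
        raise ValueError("need 1/2 <= d < 1 and |x| <= d")

    w = x                # partial remainder, kept within [-d, d]
    q = Fraction(0)      # accumulated quotient
    digits = []          # signed quotient digits, most significant first
    for j in range(1, n + 1):
        shifted = 2 * w
        # Digit selection against the constants +1/2 and -1/2; the
        # redundancy of the digit set lets these thresholds be
        # independent of the divisor.
        if shifted >= half:
            digit = 1
        elif shifted < -half:
            digit = -1
        else:
            digit = 0
        w = shifted - digit * d          # w_{j+1} = 2*w_j - q_{j+1}*d
        q += Fraction(digit, 2 ** j)
        digits.append(digit)
    return q, w, digits
```

Each iteration produces one signed quotient digit, so after `n` iterations the quotient error is at most `2**-n`. Hardware designs reduce the iteration count per cycle by raising the radix or, as this abstract proposes, by overlapping iterations.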
