Abstract

Jointly computing the square root (SQRT) and the inverse square root (ISQRT) of floating-point numbers is common in many algorithms, e.g., in image or time-series data processing when computing norms or normalizing vectors. Existing designs suffer from high latency and inefficient resource utilization because they use separate architectures for the two operations. In this paper, we first propose a non-iterative approximation method for computing SQRT and ISQRT, based on the Chebyshev min-max criterion, that reduces latency while meeting the accuracy requirements of various applications. We then design a shared architecture for the two operations and implement it on an FPGA with fewer logic units. In contrast with other approximation solutions, our method requires no iterations, and its accuracy can be estimated mathematically. A comparison with vendor-provided IP cores for FPGAs shows that our proposed SQRT/ISQRT floating-point IP core uses fewer resources while reducing the clock-cycle latency by a factor of nearly four.
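To make the idea of a non-iterative min-max approximation concrete, the following is a minimal Python sketch, not the paper's design. It assumes a piecewise-linear fit, a hypothetical table of 64 segments over the reduced range [1, 4), and ordinary double-precision arithmetic in place of fixed-point hardware; the abstract specifies none of these. All names (`SEGMENTS`, `minimax_line`, `build_table`, `approx_sqrt_isqrt`) are illustrative. Both results come from one shared range reduction and one shared table index, followed by a single multiply-add each.

```python
import math

SEGMENTS = 64        # hypothetical table size, not taken from the paper
LO, HI = 1.0, 4.0    # reduced-argument range covering both exponent parities


def minimax_line(f, tangent_point, a, b):
    """Best (Chebyshev min-max) line for a convex or concave f on [a, b].

    Returns (slope, intercept, max_abs_error). The optimal line lies halfway
    between the chord through the endpoints and the parallel tangent.
    """
    slope = (f(b) - f(a)) / (b - a)
    c = tangent_point(slope)          # point where f'(c) equals the chord slope
    chord = f(a) - slope * a          # intercept of the chord
    tangent = f(c) - slope * c        # intercept of the parallel tangent
    return slope, 0.5 * (chord + tangent), 0.5 * abs(tangent - chord)


def build_table(f, tangent_point):
    width = (HI - LO) / SEGMENTS
    return [minimax_line(f, tangent_point, LO + i * width, LO + (i + 1) * width)
            for i in range(SEGMENTS)]


# sqrt:    f'(c) = 1 / (2 sqrt(c)) = s   gives  c = 1 / (4 s^2)
# 1/sqrt:  f'(c) = -c^(-3/2) / 2   = s   gives  c = (-2 s)^(-2/3)
SQRT_TABLE = build_table(math.sqrt, lambda s: 1.0 / (4.0 * s * s))
ISQRT_TABLE = build_table(lambda x: 1.0 / math.sqrt(x),
                          lambda s: (-2.0 * s) ** (-2.0 / 3.0))


def approx_sqrt_isqrt(x):
    """Return (sqrt(x), 1/sqrt(x)) approximations without any iteration."""
    if not (x > 0.0 and math.isfinite(x)):
        raise ValueError("expected a positive finite number")
    mant, exp = math.frexp(x)         # x = mant * 2**exp, mant in [0.5, 1)
    if exp % 2 == 0:                  # force an even exponent, mant in [1, 4)
        mant, exp = mant * 4.0, exp - 2
    else:
        mant, exp = mant * 2.0, exp - 1
    half = exp // 2
    # One index selects the coefficients for both functions.
    idx = min(int((mant - LO) / (HI - LO) * SEGMENTS), SEGMENTS - 1)
    s_slope, s_icpt, _ = SQRT_TABLE[idx]
    r_slope, r_icpt, _ = ISQRT_TABLE[idx]
    return (math.ldexp(s_slope * mant + s_icpt, half),
            math.ldexp(r_slope * mant + r_icpt, -half))
```

The third element of each table entry is the largest absolute error of that segment's line on the reduced range, so the worst-case accuracy of this sketch is known before any input is processed, e.g. via `max(entry[2] for entry in SQRT_TABLE)`. Increasing `SEGMENTS` trades table size for accuracy.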
