Abstract

The fused-multiply-add (FMA) instruction is a common instruction in RISC processors since 1990. A 3-stage, 8-level pipelined, dual-precision FMA is proposed here that can perform operations either at one double precision (SISD) or at two single precision in parallel (SIMD). The 53-bit mantissa-multiplier (MM) is optimally segmented by Karatsuba–Offman (KO) algorithm such that both modes can be performed. The 6-stage pipelined MM uses only 6 of 10 multipliers and 13 of 33 adder/subtractors in SIMD. Thus hardware area of the proposed MM is reduced by 23.82% and throughput is maintained to be 923M samples/s. The arithmetic operational units in the data path are shared among the modes by having four data rearrangement units (DRU) which rearranges the data systematically at the input, the outputs of MM and the final output. Though these DRUs bring some hardware overhead, the resulting architecture is modular and uniform for both modes of computation. The proposed FMA has been implemented using TSMC 1P6M CMOS 130 nm library and takes 48% less overall area and consumes 49% less power at 308.7 MHz compared to previous results. The area-delay-product (ADP), 0.48 × 10−15 shows that the area optimization by proposed KO based MM can also keep the computation time as 3.24 ns.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.