Low‐precision DSP‐based floating‐point multiply‐add fused for Field Programmable Gate Arrays

Alexandru Amaricai, Oana Boncalo, Constantina‐Elena Gavriliu

doi:10.1049/iet-cdt.2013.0128

Abstract

Floating-point (FP) multiply-add fused ($F_1 \times F_2 \pm F_3$) and multiply-accumulate are the most common arithmetic operations in a wide range of applications, such as graphics processing, multimedia or FP digital signal processing (DSP). This study proposes FP multiply-add fused units for low-precision formats (the IEEE 16-bit half precision or the 32-bit single precision) which rely on modern Field Programmable Gate Array (FPGA) features, namely the integer multiply-accumulate support built into the FPGA DSP blocks. These blocks are employed as building blocks within the mantissa data-path, where they perform both the multiplication and the add/subtract operations. In order to use the DSP block for these operations, the alignment right shifts are performed before the multiply-add stage: a right shift on the addend and a right shift on one of the multiplicands. This results in efficient DSP usage; thus both cost savings and higher performance (high working frequencies and low latencies) are obtained for the multiply-add fused operation.
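The pre-alignment idea can be illustrated with a small integer model. This is only a sketch under stated assumptions, not the authors' data-path: the function name `prealigned_fma`, the helper `to_fraction`, the width `P = 11` (half-precision significand with hidden bit), and the convention that an operand has value m * 2^(e - (P - 1)) are all chosen here for illustration. Signs of the operands, rounding, guard and sticky bits, and normalization are omitted, so the shifts simply truncate.

```python
from fractions import Fraction

P = 11  # assumed significand width (hidden bit included), as in half precision


def prealigned_fma(m1, e1, m2, e2, m3, e3, subtract=False):
    """Model F1 * F2 +/- F3 with the alignment shifts done before the multiply-add.

    Each operand is an integer significand m and an unbiased exponent e,
    with value m * 2**(e - (P - 1)). Returns (acc, exp), value acc * 2**exp.
    """
    d = (e1 + e2) - e3        # exponent difference between product and addend
    addend = m3 << (P - 1)    # put the addend on the product's fixed-point scale
    if d >= 0:
        addend >>= d          # product dominates: right shift on the addend
        scale = e1 + e2
    else:
        m2 >>= -d             # addend dominates: right shift on one multiplicand
        scale = e3
    # Single integer multiply-add, the operation a DSP block would carry out.
    acc = m1 * m2 - addend if subtract else m1 * m2 + addend
    return acc, scale - 2 * (P - 1)


def to_fraction(acc, exp):
    """Exact value of an (acc, exp) pair, for checking the model."""
    return Fraction(acc) * Fraction(2) ** exp


if __name__ == "__main__":
    # 1.5 * 2.0 + 0.25: the product has the larger exponent, so the addend is shifted.
    print(to_fraction(*prealigned_fma(1536, 0, 1024, 1, 1024, -2)))
    # 1.0 * 0.5 - 8.0: the addend has the larger exponent, so a multiplicand is shifted.
    print(to_fraction(*prealigned_fma(1024, 0, 1024, -1, 1024, 3, subtract=True)))
```

In both branches the shifting happens on the inputs, so the multiplication and the addition or subtraction collapse into one integer multiply-add. The truncation in the multiplicand shift loses low-order bits in general; the example operands are chosen so that no non-zero bits are dropped.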

Full Text