Mantissa Multiplication Research Articles

The use of artificial intelligence (AI) in sensor analytics is entering a new era based on ubiquitous embedded connected devices. This transformation requires design techniques that reconcile accurate results with sustainable system architectures. Both the efficiency of AI hardware engines and backward compatibility must therefore be considered. In this paper, we present the Hybrid-Float6 (HF6) quantization and its dedicated hardware design. We propose an optimized multiply-accumulate (MAC) hardware that reduces the mantissa multiplication to a multiplexor-adder operation. We exploit the intrinsic error tolerance of neural networks to further reduce the hardware design with approximation. To preserve model accuracy, we present a quantization-aware training (QAT) method, which in some cases improves accuracy. We demonstrate this concept in 2D convolution layers. We present a lightweight tensor processor (TP) implementing a pipelined vector dot-product. For compatibility and portability, the 6-bit floating-point (FP) value is wrapped in the standard FP format and automatically extracted by the proposed hardware. The hardware/software architecture is compatible with TensorFlow (TF) Lite. We evaluate the applicability of our approach with a CNN-regression model for anomaly localization in a structural health monitoring (SHM) application based on acoustic emission (AE). The embedded hardware/software framework is demonstrated on the XC7Z007S, the smallest Zynq-7000 SoC. The proposed implementation achieves a peak power efficiency of 5.7 GFLOPS/s/W and a run-time acceleration of $48.3\times$.
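To illustrate how a mantissa multiplication can collapse into a multiplexor-adder operation, here is a minimal Python sketch. It assumes the quantized weight carries only a very short mantissa fraction (one bit by default); the abstract does not state the HF6 bit layout, so the field width, the function name `mux_adder_mantissa_product`, and its parameters are all hypothetical. With so few fraction bits, each bit merely selects a shifted copy of the activation mantissa (the multiplexor), and the selected copies are summed (the adder).

```python
def mux_adder_mantissa_product(act_mantissa: int, weight_frac: int, frac_bits: int = 1) -> int:
    """Return act_mantissa * (1.weight_frac), scaled by 2**frac_bits.

    Assumption (not taken from the abstract): the weight significand is
    1.weight_frac with only `frac_bits` fraction bits.
    """
    if not 0 <= weight_frac < (1 << frac_bits):
        raise ValueError("weight_frac does not fit in frac_bits")

    # The implicit leading one always contributes a full-weight copy.
    acc = act_mantissa << frac_bits

    for i in range(frac_bits):
        bit = (weight_frac >> i) & 1
        # Multiplexor: pass the shifted activation mantissa or zero.
        partial = (act_mantissa << i) if bit else 0
        # Adder: accumulate the selected partial product.
        acc += partial
    return acc


if __name__ == "__main__":
    a = 0b1011  # example activation mantissa (11)
    print(mux_adder_mantissa_product(a, 0))  # weight significand 1.0
    print(mux_adder_mantissa_product(a, 1))  # weight significand 1.5
```

For a one-bit fraction the loop runs once, so the whole product is one selection plus one addition. No general array multiplier is needed under that assumption.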

Many applications, such as machine learning and sensor data analysis, are statistical in nature and can tolerate some level of inaccuracy in their computation. Approximate computing is a viable method to save energy and increase performance by controllably trading accuracy for energy savings. In this paper, we propose a tiered approximate floating point multiplier, called CFPU, which significantly reduces energy consumption and improves the performance of multiplication at a slight cost in accuracy. The floating point multiplication is approximated by replacing the costly mantissa multiplication step of the operation with lower energy alternatives. We process the data in one of three modes, depending on the accuracy requirements: a basic approximate mode, an intermediate approximate mode, or the exact hardware. We evaluate the efficiency of the proposed CFPU on a wide range of applications, including twelve general OpenCL applications and three machine learning applications. Our results show that using the first CFPU approximation mode results in a $3.5\times$ energy-delay product (EDP) improvement, compared to a GPU using traditional floating point units (FPUs), while ensuring less than 10% average relative error. Adding the second mode further increases the EDP improvement to $4.1\times$, compared to an unmodified FPU, for less than 10% error. In addition, our results show that the proposed CFPU can achieve a $2.8\times$ EDP improvement for multiply operations compared to state-of-the-art approximate multipliers.
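The abstract does not say which lower energy alternatives CFPU substitutes for the mantissa multiplication. As a stand-in, the sketch below uses Mitchell's logarithmic approximation, a different and well-known technique in which the product of significands $(1+f_x)(1+f_y)$ is replaced by an addition of the fractions. It is meant only to show what replacing the mantissa multiplier with an adder looks like, not to reproduce CFPU; the function name `mitchell_multiply` is hypothetical, and only finite inputs are handled.

```python
import math


def mitchell_multiply(x: float, y: float) -> float:
    """Approximate x * y by adding significand fractions (Mitchell's method).

    Illustration only: this is not the CFPU algorithm.
    """
    if x == 0.0 or y == 0.0:
        return 0.0
    sign = -1.0 if (x < 0) != (y < 0) else 1.0

    # frexp gives |x| = m * 2**e with m in [0.5, 1); rescale to 1.f form.
    mx, ex = math.frexp(abs(x))
    my, ey = math.frexp(abs(y))
    fx, ex = mx * 2.0 - 1.0, ex - 1
    fy, ey = my * 2.0 - 1.0, ey - 1

    # Exponents add exactly; the fraction sum replaces the mantissa product.
    exp = ex + ey
    frac = fx + fy
    if frac < 1.0:
        significand = 1.0 + frac
    else:
        # Carry out of the fraction bumps the exponent by one.
        significand = frac
        exp += 1
    return sign * math.ldexp(significand, exp)


if __name__ == "__main__":
    for a, b in [(2.0, 4.0), (1.5, 1.5), (3.0, -7.0)]:
        approx, exact = mitchell_multiply(a, b), a * b
        print(a, b, approx, exact, abs(approx - exact) / abs(exact))
```

Mitchell's method never overestimates the magnitude of the product, and its relative error peaks at about 11.1% when both fractions equal 0.5. A tiered design could fall back to an exact multiplier for operands where the error would exceed an application's tolerance.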

Articles published on Mantissa Multiplication

An energy‐efficient floating‐point compute SRAM with pipelined in‐memory bit‐parallel exponent and bitwise mantissa processing

Design of Power Efficient Posit Multiplier using Compressor Based Adder

All-Digital Computing-in-Memory Macro Supporting FP64-Based Fused Multiply-Add Operation

An area-delay efficient single-precision floating-point multiplier for VLSI systems

CNN Sensor Analytics With Hybrid-Float6 Quantization on Low-Power Embedded FPGAs

Approximate Floating-Point Multiplier based on Static Segmentation

Logarithm-approximate floating-point multiplier

Quantum-dot cellular automata based design of multifunction binary right shifter circuit

Improving Power of DSP and CNN Hardware Accelerators Using Approximate Floating-point Multipliers

Multi‐precision binary multiplier architecture for multi‐precision floating‐point multiplication

Improvised hierarchy of floating point multiplication using 5:3 compressor

Design of Power Efficient Posit Multiplier

Implementation of 64 Bit Complex Floating-Point Multiplier on FPGA using Vedic Mathematics Sutra- Urdhva Tiryagbhyam

Runtime Efficiency-Accuracy Tradeoff Using Configurable Floating Point Multiplier

Reconfigurable half-precision floating-point real/complex fused multiply and add unit

Design of quadruple precision multiplier architectures with SIMD single and double precision support

Resource Efficient Single Precision Floating Point Multiplier Using Karatsuba Algorithm

Efficient dual-precision floating-point fused-multiply-add architecture
