"Approximate Multipliers Based on Low-Power 4:2 Compressors for Error-Tolerant Applications"

Abstract

Energy-efficient and high-performance general-purpose compute engines, as well as application-specific integrated circuits, are in high demand to facilitate the development of artificial intelligence and big data processing applications. However, with the end of Dennard scaling and Moore’s law, it is becoming difficult to handle the massive amounts of data and complex computations required in these applications. Approximate computing (AC) has emerged as an attractive paradigm in digital design to address this unprecedented challenge. AC is driven by the observation that many state-of-the-art applications, such as classification, machine learning, data mining, robotics and communication, exhibit error-tolerant characteristics; therefore, a small amount of error can be introduced, relaxing the requirement of exact computation, to achieve area, power, and speed benefits. AC techniques can be applied at both the software and hardware layers. At the hardware layer, arithmetic units (multipliers, adders, and dividers) are the basic computational modules. Therefore, approximation at the hardware layer has focused on the design of approximate arithmetic units. This paper presents approximate multipliers based on novel 4:2 compressors for error-tolerant applications. The proposed 4:2 compressors exhibit zero-mean error behavior while having hardware utilization comparable to existing state-of-the-art designs. Hardware-efficient as well as error-efficient designs of variable accuracy-power have been investigated to explore the maximum trade-off. All the designs are synthesized using the Cadence Genus synthesis tool (TSMC 65 nm technology), and power is reported using the Cadence Joules RTL power solution. A comprehensive error analysis is performed using well-known error metrics such as error distance (ED), mean error distance (MED), mean relative error distance (MRED) and normalized mean error distance (NMED). Moreover, all the designs are also compared with respect to power-delay product (PDP) and MRED to determine which designs lie on the error-energy Pareto-optimal curve. A case study is also presented to demonstrate the applicability of the proposed designs in a practical image processing application.
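
The abstract names its error metrics without defining them. As a rough illustration, the Python sketch below evaluates an 8x8 multiplier exhaustively under the definitions commonly used in the approximate-arithmetic literature: ED is the absolute difference from the exact product, MED is its mean, NMED divides MED by the largest exact product, and MRED averages ED relative to the exact product. These definitions, the signed mean error `ME`, and the function names are assumptions; `approx_mul` is a hypothetical truncation-based stand-in, not the compressor-based design proposed in the paper.

```python
from itertools import product


def approx_mul(a: int, b: int, dropped_bits: int = 4) -> int:
    """Hypothetical stand-in: exact product with its low bits forced to zero."""
    return ((a * b) >> dropped_bits) << dropped_bits


def error_metrics(mul, n_bits: int = 8) -> dict:
    """Exhaustively compare `mul` against exact unsigned multiplication."""
    max_product = (2**n_bits - 1) ** 2       # largest exact output, used for NMED
    signed, eds, reds = [], [], []
    for a, b in product(range(2**n_bits), repeat=2):
        exact = a * b
        err = mul(a, b) - exact              # signed error for this input pair
        signed.append(err)
        eds.append(abs(err))                 # error distance (ED)
        if exact != 0:                       # relative error is undefined for exact == 0
            reds.append(abs(err) / exact)
    med = sum(eds) / len(eds)
    return {
        "ME": sum(signed) / len(signed),     # signed mean; zero for a zero-mean design
        "MED": med,
        "NMED": med / max_product,
        "MRED": sum(reds) / len(reds),
    }


if __name__ == "__main__":
    for name, value in error_metrics(approx_mul).items():
        print(f"{name}: {value:.6g}")
```

Truncation always rounds down, so its `ME` is negative; a zero-mean design of the kind the abstract describes would instead report an `ME` close to zero under the same measurement.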

Similar Papers
  • Research Article
  • Cite Count Icon 11
  • 10.1016/j.memori.2022.100017
High-performance, energy-efficient, and memory-efficient FIR filter architecture utilizing 8x8 approximate multipliers for wireless sensor network in the Internet of Things
  • Oct 12, 2022
  • Memories - Materials, Devices, Circuits and Systems
  • Charles Rajesh Kumar J + 2 more

  • Research Article
  • Cite Count Icon 28
  • 10.1109/les.2022.3192530
High Efficient GDI-CNTFET-Based Approximate Full Adder for Next Generation of Computer Architectures
  • Mar 1, 2023
  • IEEE Embedded Systems Letters
  • Ayoub Sadeghi + 3 more

Approximate computing (AC) is an emerging technique in arithmetic circuits. In this letter, a new AC-based full adder (FA) circuit is presented with 12 transistors, 0.239 $\mu \text{m}^{2}$ of area, and two errors in the outputs. In the proposed FA, the gate diffusion input (GDI) and dynamic-threshold (DT) techniques are applied using the carbon nanotube field-effect transistor (CNTFET) technology. Accuracy metrics, such as normalized mean error distance (NMED) and mean relative error distance (MRED) along with circuitry parameters of power-delay-product (PDP), energy-delay-product (EDP), and power-delay-area-product (PDAP), confirm the efficiency of the proposed FA for complex structures. The proposed FA is embedded in a ripple carry adder (RCA) by various numbers of approximate bits (NABs), and then the accuracy and circuitry parameters are extracted. Compared to the state-of-the-art designs, the high-efficient behavior of the proposed FA is proved when it is used in image processing applications.

  • Conference Article
  • Cite Count Icon 15
  • 10.1109/sips47522.2019.9020404
Design and Evaluation of a Power-Efficient Approximate Systolic Array Architecture for Matrix Multiplication
  • Oct 1, 2019
  • Haroon Waris + 3 more

Matrix multiplication (MM) is a basic operation for many Digital Signal Processing applications. A Systolic Array (SA) is often considered one of the most favorable architectures for achieving high performance in matrix multiplication. In this paper, the design exploration for an approximate SA is pursued; three design schemes are proposed by introducing approximation in multiple sub-modules. An approximation factor $\alpha$ is introduced; it is related to the inexact columns in the SA to explore the accuracy-efficiency trade-off present in the proposed designs. In the evaluation, an 8-bit input operand matrix multiplication is considered; the Synopsys Design Compiler at the 45 nm technology node is used to establish hardware-related metrics. The Error Rate (ER), Normalized Mean Error Distance (NMED) and Mean Relative Error Distance (MRED) are used as figures of merit for error analysis. Results show that the proposed architecture for 8-bit matrix multiplication with an approximation factor $\alpha=7$ has lower power consumption than existing inexact designs found in the technical literature with comparable NMED. In addition, a power-delay product (PDP) vs. NMED analysis shows that the proposed designs have a lower PDP and are therefore applicable to low-power applications. The practicality of the proposed architecture is established by computing the Discrete Cosine Transform.

  • Conference Article
  • Cite Count Icon 2
  • 10.1109/ises52644.2021.00045
Fast Booth Multipliers Using Approximate 4:2 Compressors
  • Dec 1, 2021
  • S Sreeparvathy + 3 more

Approximate computing is a computational model targeted at high-speed, low-power execution of error-resilient applications. In this paper, we design and implement the RTL model of four variants of $8\times 8$ Booth and modified Booth (Radix-2) multipliers using two different designs of approximate 4:2 compressors. MATLAB models of the variants are implemented and compared with the RTL models based on three quality metrics, namely mean error distance (MED), mean relative error distance (MRED), and normalized mean error distance (NMED). RTL models of the multipliers are implemented in Verilog and synthesized with a Xilinx Zynq-7000 FPGA as the target device. The different variants are compared in terms of the quality metrics, resource utilization, delay and power dissipation. The comparison shows that our implementation gives similar performance in terms of delay and resource utilization, with a lower error rate than state-of-the-art implementations.
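
For readers unfamiliar with the building block, the sketch below models the textbook exact 4:2 compressor as two cascaded full adders and checks its defining identity over all 32 input patterns. It is only a reference point: the approximate compressor designs evaluated in this paper are not specified in the abstract and are not reproduced here, and the function names are illustrative.

```python
from itertools import product


def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> tuple[int, int, int]:
    """Exact 4:2 compressor built from two cascaded full adders.

    Returns (sum, carry, cout); sum has weight 1, carry and cout have weight 2.
    """
    s1, cout = full_adder(x1, x2, x3)   # cout is independent of cin
    s, carry = full_adder(s1, x4, cin)
    return s, carry, cout


def check_exact() -> bool:
    """Check x1+x2+x3+x4+cin == sum + 2*(carry + cout) for every input pattern."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True


if __name__ == "__main__":
    print("exact for all inputs:", check_exact())
```

An approximate compressor typically simplifies this logic so that the identity fails for a few input patterns, trading those errors for fewer gates.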

  • Research Article
  • 10.1587/elex.22.20250507
Adder-free dynamic compensation for logarithmic multipliers based on minimum worst-case error
  • Dec 10, 2025
  • IEICE Electronics Express
  • Yiqi Zhou + 4 more

Logarithmic multipliers offer hardware efficiency but suffer from significant errors. This brief proposes a high-accuracy design using a WCE-minimizing compensation algorithm that dynamically selects the larger operand for optimal scaling. The resulting compensation value enables direct error correction without additional adders. Zero-padding exploitation facilitates bit-width truncation, reducing barrel shifter and adder complexity while preserving accuracy. Compared to prior designs, the multiplier achieves minimal normalized mean error distance (NMED) and mean relative error distance (MRED) with near-optimal power-delay product (PDP), establishing an optimal accuracy-efficiency tradeoff. Additionally, it induces a double-sided error distribution that mitigates excessive error accumulation in multiply-accumulate applications.
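
To make the error behavior of an uncompensated logarithmic multiplier concrete, the sketch below implements the classic Mitchell approximation, in which $\log_2(1+x)$ is replaced by $x$. This is the well-known baseline scheme, not the compensated design proposed in this brief; the function names are illustrative.

```python
def mitchell_mul(a: int, b: int) -> float:
    """Classic Mitchell logarithmic multiplication of two non-negative integers."""
    if a == 0 or b == 0:
        return 0.0
    k1, k2 = a.bit_length() - 1, b.bit_length() - 1   # leading-one positions
    x1 = a / 2**k1 - 1.0                               # fractional parts in [0, 1)
    x2 = b / 2**k2 - 1.0
    frac = x1 + x2                                     # log2(1 + x) ~ x, so the logs just add
    if frac < 1.0:
        return 2.0 ** (k1 + k2) * (1.0 + frac)         # no carry into the characteristic
    return 2.0 ** (k1 + k2 + 1) * frac                 # carry: mantissa becomes frac - 1


def signed_relative_errors(n_bits: int = 8) -> list[float]:
    """Signed relative error for every pair of nonzero n-bit operands."""
    values = range(1, 2**n_bits)
    return [(mitchell_mul(a, b) - a * b) / (a * b) for a in values for b in values]


if __name__ == "__main__":
    errs = signed_relative_errors()
    print("most negative relative error:", min(errs))
    print("most positive relative error:", max(errs))
```

The uncompensated scheme never overestimates the product, so its error is one-sided and accumulates in multiply-accumulate chains; the double-sided distribution described in the abstract is aimed at that accumulation.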

  • Research Article
  • Cite Count Icon 53
  • 10.3390/electronics9030471
Design and Analysis of an Approximate Adder with Hybrid Error Reduction
  • Mar 11, 2020
  • Electronics
  • Hyoju Seo + 2 more

This paper presents an energy-efficient approximate adder with a novel hybrid error reduction scheme to significantly improve the computation accuracy at the cost of extremely low additional power and area overheads. The proposed hybrid error reduction scheme utilizes only two input bits and adjusts the approximate outputs to reduce the error distance, which leads to an overall improvement in accuracy. The proposed design, when implemented in 65-nm CMOS technology, has 3, 2, and 2 times greater energy, power, and area efficiencies, respectively, than conventional accurate adders. In terms of accuracy, the proposed hybrid error reduction scheme lowers the error rate of the proposed adder to 50%, whereas those of the lower-part OR adder and optimized lower-part OR constant adder reach 68% and 85%, respectively. Furthermore, the proposed adder has up to 2.24, 2.24, and 1.16 times better performance with respect to the mean error distance, normalized mean error distance (NMED), and mean relative error distance, respectively, than the other approximate adders considered in this paper. Importantly, because of an excellent design tradeoff among delay, power, energy, and accuracy, the proposed adder is found to be the most competitive approximate adder when jointly analyzed in terms of the hardware cost and computation accuracy. Specifically, our proposed adder achieves 51%, 49%, and 47% reductions of the power-, energy-, and error-delay-product-NMED products, respectively, compared to the other considered approximate adders.
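
The lower-part OR adder (LOA) used as a baseline above can be sketched as follows, assuming the usual formulation: the low `k` bits are computed with a bitwise OR, and the carry into the exact upper part is guessed from the AND of the two most significant low-part bits. The bit widths and function names are illustrative assumptions, and the sketch does not attempt to reproduce the 68% figure quoted in the abstract, whose configuration is not given here.

```python
def loa_add(a: int, b: int, k: int = 4) -> int:
    """Lower-part OR adder sketch: OR the k low bits, add the upper bits exactly."""
    if k == 0:
        return a + b                                # no approximate part at all
    mask = (1 << k) - 1
    low = (a | b) & mask                            # approximate lower part
    carry_in = (a >> (k - 1)) & (b >> (k - 1)) & 1  # guessed carry into the upper part
    high = (a >> k) + (b >> k) + carry_in           # exact upper-part addition
    return (high << k) | low


def error_rate(n_bits: int = 8, k: int = 4) -> float:
    """Fraction of n-bit operand pairs for which the LOA result is wrong."""
    total = 2 ** (2 * n_bits)
    wrong = sum(
        loa_add(a, b, k) != a + b
        for a in range(2**n_bits)
        for b in range(2**n_bits)
    )
    return wrong / total


if __name__ == "__main__":
    print("error rate (8-bit, k=4):", error_rate())
```

An error occurs whenever the two operands share a set bit in the low part below its top position, or when a carry chain in the low part is missed by the single-bit guess, which is the kind of error the hybrid reduction scheme described above sets out to correct.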

  • Research Article
  • Cite Count Icon 50
  • 10.1109/access.2021.3108443
A Novel Approximate Adder Design Using Error Reduced Carry Prediction and Constant Truncation
  • Jan 1, 2021
  • IEEE Access
  • Jungwon Lee + 3 more

This paper proposes a novel approximate adder that exploits an error-reduced carry prediction and constant truncation with error reduction schemes. The proposed adder design techniques significantly improve overall computation accuracy while providing excellent hardware efficiency. Particularly, the proposed carry prediction technique can reduce a prediction error rate by up to 75% compared to existing approximate adders considered in this paper. Furthermore, the error reduction technique also enhances the overall computation accuracy by decreasing the error distance (ED). Our experimental results show that the proposed adder improves the normalized mean ED (NMED) and mean relative ED (MRED) by up to 91.4% and 98.9%, respectively, compared to the other approximate adders. Importantly, an excellent design tradeoff allows the proposed adder to be the most competitive of the adders under consideration. Specifically, the proposed adder achieves up to 95.7%, 91.1%, and 93.2% reductions of the power-NMED, energy-NMED, and area-delay product (ADP)-NMED products, respectively, compared to the other adders. Our adder enhances the power-, energy-, and ADP-MRED products by up to 99.4% compared to the others. In particular, the figure of merit (FoM) considering both hardware and accuracy of the proposed adder is up to 99.95% smaller than that of the other approximate adders considered herein. Furthermore, we confirm that the approximation errors caused by the proposed adder have very little impact on output quality when adopted in practical applications, such as digital image processing and machine learning.

  • Research Article
  • Cite Count Icon 11
  • 10.5573/ieiespc.2019.8.4.324
An Accuracy Enhanced Error Tolerant Adder with Carry Prediction for Approximate Computing
  • Aug 31, 2019
  • IEIE Transactions on Smart Processing & Computing
  • Yongtae Kim

This paper presents a new approximate adder design to improve the computation accuracy of the conventional error tolerant adder by leveraging a carry prediction technique with a sum generator. The proposed carry speculation scheme exploits inputs from a single bit position and effectively increases the bit width of the accurate addition. Implemented in a 65-nm CMOS technology, the proposed approximate adder is up to two times faster than, and twice as power efficient as, the traditional adders. Compared to the other approximate adders considered in this paper, the proposed adder achieves up to 3.7%, 15.5%, 79.9% and 79.9% reductions in the error rate (ER), mean relative error distance (MRED), mean error distance (MED) and normalized MED (NMED), respectively, at an extra cost of merely 4% to 6% in area, delay, and power. In addition, the proposed adder offers a good tradeoff between power/energy and accuracy and improves on power/energy-NMED products by up to 46%, outperforming other approximate adders.

  • Research Article
  • Cite Count Icon 15
  • 10.1109/tcsi.2022.3167894
An Energy-Efficient Approximate Divider Based on Logarithmic Conversion and Piecewise Constant Approximation
  • Jul 1, 2022
  • IEEE Transactions on Circuits and Systems I: Regular Papers
  • Yong Wu + 8 more

Approximate computing (AC) has been considered as a promising paradigm to improve the energy-efficiency of computing hardware for error-tolerant applications, with negligible quality degradation to the output. Dividers frequently limit the performance of a computing system; however, they have not received as much attention as multipliers and adders in AC. In this paper, an energy-efficient and high-performance approximate divider is proposed based on logarithmic conversion and piecewise constant approximation. In this design, the range for the conversion between binary and logarithmic numbers is first expanded from $[0,1]$ to $[-0.5,1]$. A heuristic search algorithm is then devised to find the most accurate constant set to approximate the reciprocal of the divisor, by minimizing a statistical error. The hardware implementation is presented for both floating-point (FP) and integer dividers. With a high configurability, the proposed divider results in a mean relative error distance (MRED) from 2.78% to 0.046%, indicating a high accuracy among state-of-the-art approximate dividers. Compared to the half-precision FP divider, the proposed divider with a MRED of 0.74% can achieve nearly $90\times$ improvement in PDP. Moreover, compared to state-of-the-art approximate dividers, the proposed design is in the Pareto Frontier in terms of power delay product (PDP) and MRED. The three image processing application results demonstrate that the proposed divider can result in the highest peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) even with truncation.

  • Conference Article
  • Cite Count Icon 1
  • 10.1109/pcems58491.2023.10136063
Design Of Wallace Multiplier Using Novel Approximate 4:2 Compressors
  • Apr 5, 2023
  • Srinivas Pavan Jonnalagadda + 4 more

In this study, we propose an energy-efficient approximate multiplier that employs an approximate 4-2 compressor. Compared to current designs, the proposed compressor has a small area. Simulation results reveal that the proposed approximate multipliers display a reasonable decrease in Mean Error Distance, Mean Relative Error Distance, and Normalized Mean Error Distance compared to a multiplier that is designed with exact compressors. According to implementation results in 90 nm CMOS technology, the power, delay and area of multipliers built with this approximate compressor are superior to those obtained with previously proposed approximate compressors.

  • Research Article
  • 10.1088/1402-4896/ae1672
A noise-resilient and configurable approximate quantum multiplier for enhanced computation fidelity on NISQ devices
  • Nov 1, 2025
  • Physica Scripta
  • Sungyoun Hwang + 3 more

Quantum arithmetic circuits, such as adders and multipliers, are essential for many quantum algorithms, but their practical deployment on noisy intermediate-scale quantum (NISQ) devices remains challenging due to limited coherence times and high gate error rates. In this paper, we propose Qaradox, a configurable approximate quantum multiplier architecture that achieves higher computation accuracy than its exact counterpart under realistic quantum noise. The proposed architecture introduces a novel class of controlled quantum adders that enable flexible combinations of exact, approximate, and truncated operations, allowing the quantum circuit to be tailored to hardware-specific noise characteristics. Experimental results using IBM’s 127-qubit Brisbane noise model show that Qaradox reduces the error rate (ER), normalized mean error distance (NMED), and mean relative error distance (MRED) by up to 23.7%, 80.3%, and 88.1%, respectively, compared to the fully exact multiplier. Furthermore, when applied to image sharpening, the proposed architecture improves peak signal-to-noise ratio (PSNR) by more than three times and increases structural similarity index measure (SSIM) from 0.00 to 0.83, effectively recovering visual quality lost in exact designs. These results demonstrate that approximation, when applied structurally and selectively, can enhance both robustness and correctness in quantum arithmetic for NISQ-era systems.

  • Conference Article
  • Cite Count Icon 9
  • 10.1109/dft.2017.8244437
Simulation-based evaluation of frequency upscaled operation of exact/approximate ripple carry adders
  • Oct 1, 2017
  • H Junqi + 3 more

This paper presents a simulation-based evaluation of approximate (inexact) and exact adder cells and ripple carry adders (RCA) using a frequency upscaling technique. In the proposed method, at a constant supply voltage, the frequency of the inputs applied to an adder cell is increased (upscaled) beyond its largest operating value, thereby generating errors in the addition operation. In this paper, exact/inexact full adder cells (mirror adder and AMA1) are initially operated under frequency upscaling at different feature sizes for the transistors in the circuits. The effects of process variations (such as gate length and supply voltage) are also analyzed with respect to the frequency upscaling process. An exhaustive simulation under frequency upscaling for 4-bit and 8-bit RCAs using exact/inexact cells is then pursued. It is observed that the inexact adder sustains a higher (1.3 times) frequency operation and lower energy dissipation (50% reduction) compared with an exact adder. Also, the normalized mean error distance (NMED) and the mean relative error distance (MRED) of the inexact and exact RCAs are very close.

  • Research Article
  • 10.37391/ijeer.130408
Approximate Computing Using Voltage Over Scaling Technique for Image Compression
  • Dec 10, 2025
  • International Journal of Electrical and Electronics Research
  • Junqi Huang + 2 more

Approximate computing has been extensively adopted as a fault-tolerant method to achieve energy-efficient designs in image processing. This paper introduces a novel, integrated approximate approach for implementing runtime-based voltage over scaling (VOS) at both the circuit and algorithmic levels, specifically for approximate discrete cosine transform (ADCT) and zigzag low-complexity approximate DCT (ZLCADCT) in image compression. In the proposed VOS scheme, the supply voltage of exact and approximate adder cells is reduced below the nominal level, causing the output delay to surpass the worst-case delay and generating errors in addition, while lowering energy consumption. A mathematical model applicable to both exact and approximate adder cells using VOS is first presented. The results from this model align closely with simulation outcomes, validating its accuracy. Subsequently, an exhaustive simulation of 4-bit and 8-bit subtraction, followed by ADCT and ZLCADCT, is conducted using VOS. The error rate (ER), normalized mean error distance (NMED), and mean relative error distance (MRED) for the subtractor with approximate cells are significantly lower than those for the subtractor with exact cells under VOS conditions. In ADCT, approximate full adders can operate at lower supply voltages (around 0.77 V) than exact full adders (around 0.83 V) without a significant loss in Peak Signal-to-Noise Ratio (PSNR). As the number of approximate bits (NAB) increases, the total energy dissipation of ADCT decreases by 33.2%, with an additional 20% reduction achieved through the application of ZLCADCT with VOS.

  • Research Article
  • Cite Count Icon 2
  • 10.1109/les.2025.3539308
Design of a Hardware-Efficient Approximate 4-2 Compressor for Multiplications in Image Processing
  • Aug 1, 2025
  • IEEE Embedded Systems Letters
  • Sungyoun Hwang + 2 more

This letter presents a novel hardware-efficient approximate 4-2 compressor design that significantly enhances accuracy through a systematic analysis of input patterns obtained from practical applications. We incorporate a majority operation and a compound gate in the compressor design to effectively boost hardware efficiency in multiplications. Our design approach results in substantial error reductions, with normalized mean error distance (NMED) and mean relative error distance (MRED) decreasing by up to 74.84% and 82.04%, respectively, compared to existing approximate multipliers discussed in this letter. When implemented in a 32-nm CMOS technology, the approximate multiplier adopting the proposed 4-2 compressor achieves excellent hardware efficiency, reducing area, power, and energy consumption by up to 8.95%, 13.02%, and 13.02%, respectively, compared to the other alternatives. Moreover, our design delivers enhanced performance in image processing tasks, achieving up to a $4.84\times$ increase in peak signal-to-noise ratio (PSNR) compared to other designs, all while optimizing hardware efficiency.

  • Research Article
  • Cite Count Icon 9
  • 10.5573/ieiespc.2019.8.6.506
A Novel Approximate Adder with Enhanced Low-cost Carry Prediction for Error Tolerant Computing
  • Dec 31, 2019
  • IEIE Transactions on Smart Processing & Computing
  • Yongtae Kim

This paper proposes an approximate adder that employs a novel carry speculation scheme to enhance the computation precision of the existing error tolerant adder (ETA) designs with very little hardware overhead. The proposed carry prediction technique leverages two input bits to increase the prediction accuracy, whereas the conventional ones use only one bit. This leads to a reduction of the carry prediction error rate from 25% to 18.75%. Compared to the existing ETA design, the proposed adder reduces normalized mean error distance (NMED) and mean relative error distance (MRED) by up to 10% and 28%, respectively, at the cost of only a two-input OR gate. Moreover, the proposed design outperforms the conventional ETAs when jointly evaluating hardware cost and computation accuracy. Specifically, the new design allows 11% and 17% reductions of area-power-NMED and power-NMED products, respectively, compared to the traditional ETA.
