Abstract

The radix-4 Booth algorithm is widely used to improve multiplier performance because it halves the number of partial products. However, the many additional encoders and decoders it requires make the power consumption of a Booth multiplier considerable. In this paper, a new radix-4 Booth pre-encoded mechanism is proposed to reduce the power consumption of the Booth multiplier. The proposed design effectively reduces the power dissipated in redundant activity by disabling the Booth encoders and decoders when their operation is unnecessary. In particular, since the control signals are generated early, at the pipeline input register before the multiplier, the performance of our design is better than that of the traditional Booth multiplier. Based on TSMC 40 nm technology, the simulation results show that the proposed pre-encoded mechanism reduces the dynamic and static power by 45% and 65%, respectively, compared to the traditional 16-bit radix-4 Booth multiplier. Compared to previous designs, the proposed design remains race-free and has lower power consumption. Even compared to the approximate design, the proposed design has better power efficiency while still providing exact products.

Highlights

  • Many digital signal processing (DSP) and machine learning applications are heavily dominated by multiplication [1]–[4]; e.g., more than 90% of convolutional neural network (CNN) computations are multiply-accumulate (MAC) operations [5], [6]

  • In this paper, the related works [16], [18], [19], [22] and the proposed pre-encoded mechanism are simulated using Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm CMOS technology

  • We simulate the generation of one row of partial products (PPs) of an n-bit multiplication (n = 8 or n = 16); we report the power consumption, the performance, and the transistor count (TC) to show the effectiveness of the proposed pre-encoded mechanism

Summary

INTRODUCTION

Many digital signal processing (DSP) and machine learning applications are heavily dominated by multiplication [1]–[4]; e.g., more than 90% of convolutional neural network (CNN) computations are multiply-accumulate (MAC) operations [5], [6]. The author of [16] proposed a glitch-free Booth encoder and partial product generator to eliminate the unnecessary glitches of the radix-4 Booth multiplier. These traditional designs still suffer from the high power consumption and high cost of the Booth encoders and decoders. We propose a radix-4 Booth multiplier with a pre-encoded mechanism to improve the power efficiency of multiplication. One specific feature of the radix-4 Booth algorithm is that when three consecutive bits of the multiplier Y (y_{2i+1}, y_{2i}, y_{2i−1}) have the same value, the corresponding PPs are 0. This feature inspired us to detect the “0X” case earlier, reducing the unnecessary switching activity of the radix-4 Booth encoders and decoders. We introduce the traditional radix-4 Booth algorithm and the related work.
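
The zero-product property described above can be checked with a small behavioral model. The following Python sketch is not the circuit proposed in the paper; it only models standard radix-4 Booth recoding in software. The function names (booth_radix4_digits, is_zero_triplet, booth_multiply) are hypothetical and chosen for illustration, and the multiplier Y is assumed to be an n-bit two's-complement value with n even.

```python
def booth_radix4_digits(y: int, n: int) -> list[int]:
    """Radix-4 Booth digits (each in {-2, -1, 0, 1, 2}) of an n-bit
    two's-complement multiplier y, least significant digit first."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    y &= (1 << n) - 1      # keep only the n-bit pattern of y
    padded = y << 1        # append the implicit bit y_{-1} = 0
    digits = []
    for i in range(n // 2):
        triplet = (padded >> (2 * i)) & 0b111   # (y_{2i+1}, y_{2i}, y_{2i-1})
        y_hi = (triplet >> 2) & 1
        y_mid = (triplet >> 1) & 1
        y_lo = triplet & 1
        digits.append(-2 * y_hi + y_mid + y_lo)
    return digits


def is_zero_triplet(y_hi: int, y_mid: int, y_lo: int) -> bool:
    """True when all three bits are equal, i.e. the "0X" case in which
    the partial-product row is zero."""
    return y_hi == y_mid == y_lo


def booth_multiply(x: int, y: int, n: int) -> int:
    """Exact product x * y, summing one partial product per Booth digit."""
    return sum(d * x * 4**i for i, d in enumerate(booth_radix4_digits(y, n)))


if __name__ == "__main__":
    # y = -3 is 0b11111101 in 8 bits; its two upper triplets are 111.
    print(booth_radix4_digits(-3, 8))   # [1, -1, 0, 0]
    print(booth_multiply(7, -3, 8))     # -21
```

In this model, two of the four partial-product rows for Y = −3 are zero, which is the kind of redundant encoder and decoder activity that an early zero check is meant to avoid.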

TRADITIONAL RADIX-4 BOOTH ALGORITHM
SIMULATION RESULTS
CONCLUSION