Abstract

This article analyzes the effects of approximate multiplication when performing inferences on deep convolutional neural networks (CNNs). Approximate multiplication can reduce the cost of the underlying circuits so that CNN inferences can be performed more efficiently in hardware accelerators. The study identifies the critical factors in the convolution, fully-connected, and batch normalization layers that allow more accurate CNN predictions despite the errors from approximate multiplication. The same factors also provide an arithmetic explanation of why bfloat16 multiplication performs well on CNNs. The experiments are performed with recognized network architectures to show that the approximate multipliers can produce predictions that are nearly as accurate as the FP32 references, without additional training. For example, the ResNet and Inception-v4 models with Mitch-w6 multiplication produce Top-5 errors that are within 0.2 percent of the FP32 references. A brief cost comparison of Mitch-w6 against bfloat16 is presented, where a MAC operation saves up to 80 percent of energy compared to bfloat16 arithmetic. The most far-reaching contribution of this article is the analytical justification that multiplications can be approximated while additions need to be exact in CNN MAC operations.
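
The abstract's central claim, that products may be approximated as long as the accumulation stays exact, can be illustrated with a small numerical sketch. The snippet below is not the article's method and does not model the Mitch-w circuit; it simply perturbs each product with a small, roughly zero-mean relative error as a stand-in for an approximate multiplier, and shows that the error of the accumulated dot product stays a small fraction of the accumulated magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_multiply(a, b, rel_err=0.02):
    # Hypothetical approximate multiplier: the exact product perturbed by a
    # small, roughly zero-mean relative error. A stand-in for a real
    # approximate circuit, not a model of Mitch-w.
    prod = a * b
    noise = rng.uniform(-rel_err, rel_err, size=prod.shape)
    return prod * (1.0 + noise)

def mac_exact(weights, activations):
    # Exact multiply, exact accumulate (the FP32 reference).
    return np.sum(weights * activations)

def mac_approx(weights, activations):
    # Approximate multiply, exact accumulate: the setting studied in the article.
    return np.sum(approx_multiply(weights, activations))

# A toy 3x3x256 "convolution window": many products are summed per output.
w = rng.standard_normal(3 * 3 * 256)
x = rng.standard_normal(3 * 3 * 256)

ref = mac_exact(w, x)
apx = mac_approx(w, x)
scale = np.sum(np.abs(w * x))  # total accumulated magnitude
print(f"exact={ref:.3f}  approx={apx:.3f}  "
      f"accumulated error = {abs(apx - ref) / scale:.4%} of accumulated magnitude")
```

Because the per-product errors in this toy setup are roughly symmetric around zero, they tend to cancel during accumulation, which is the intuition behind the minimized variance of error discussed in the sections listed below.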

Highlights

  • The computational costs of convolutional neural networks (CNNs) have increased as CNNs get wider and deeper to perform better predictions for a variety of applications

  • The convolution layers in CNNs consist of a large number of multiply-accumulate (MAC) operations, and they take up the majority of computations for CNN inferences [11] (see the MAC-count sketch after this list)

  • The network dependency is the reason why more complex networks require a higher number of bits and see diminished benefits from aggressive quantization; approximate multiplication is orthogonal to quantization, as approximate multipliers may be designed for any number of bits, and it complements quantization to maximize the computational efficiency of CNN inferences

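To make the second highlight concrete, the sketch below counts MAC operations for a convolution layer and a fully-connected layer. The layer shapes are illustrative only and are not taken from the article, but they show why convolution layers dominate the computation of a CNN inference.

```python
def conv2d_macs(out_h, out_w, in_ch, out_ch, k):
    # MAC count of a standard convolution layer:
    # one k*k*in_ch dot product per output element.
    return out_h * out_w * out_ch * in_ch * k * k

def fc_macs(in_features, out_features):
    # MAC count of a fully-connected layer.
    return in_features * out_features

# Illustrative shapes (not from the article): a ResNet-style 3x3 convolution
# on a 56x56x64 feature map vs. a 2048->1000 classifier layer.
conv = conv2d_macs(out_h=56, out_w=56, in_ch=64, out_ch=64, k=3)
fc = fc_macs(in_features=2048, out_features=1000)
print(f"conv MACs: {conv:,}")   # 115,605,504
print(f"fc   MACs: {fc:,}")     # 2,048,000
```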

Summary

INTRODUCTION

The computational costs of convolutional neural networks (CNNs) have increased as CNNs get wider and deeper to perform better predictions for a variety of applications. Some techniques are computationally expensive because they optimize their methods for each network model, or retrain networks to compensate for the performance degradation their methods introduce [5], [6]. Many techniques, such as [7], are only effective for small networks and cannot scale to deeper CNNs, as they report much worse performance results when tested on deeper networks. One promising hardware-based approach is the application of approximate multiplication to CNN inference [9]. It involves designing and applying multiplication circuits that have reduced hardware costs but produce results that are not exact. While optimizing CNN inference through approximate multiplication was demonstrated in several previous studies, there was limited understanding of why it worked well for CNNs. The promising results led to the general observation that CNNs were resilient against small arithmetic errors, but none of those studies identified the complete reason behind that resilience. This article also discusses the potential cost benefits of the methodology by briefly comparing its hardware costs against those of bfloat16 arithmetic.
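
For context, the Mitch-w multipliers mentioned in the abstract build on Mitchell's logarithmic multiplication, which replaces a multiplication by an addition of approximate logarithms. The Python sketch below is only a software illustration of the classic Mitchell approximation for positive integers; the actual Mitch-w circuits add operand truncation (the w parameter) and other hardware optimizations that are not modeled here.

```python
def mitchell_multiply(a: int, b: int) -> int:
    # Mitchell's approximation: treat a = 2^k1 * (1 + x1) and b = 2^k2 * (1 + x2),
    # approximate log2(a) by k1 + x1, add the two approximate logs,
    # then take the approximate antilog.
    if a == 0 or b == 0:
        return 0
    k1 = a.bit_length() - 1          # characteristic: floor(log2(a))
    k2 = b.bit_length() - 1
    x1 = a / (1 << k1) - 1.0         # mantissa fraction in [0, 1)
    x2 = b / (1 << k2) - 1.0
    s = x1 + x2
    if s < 1.0:                      # antilog approximation, no mantissa carry
        approx = (1.0 + s) * (1 << (k1 + k2))
    else:                            # mantissa carry: scale by an extra power of two
        approx = s * (1 << (k1 + k2 + 1))
    return int(approx)

for a, b in [(7, 9), (100, 200), (1234, 5678)]:
    exact = a * b
    approx = mitchell_multiply(a, b)
    print(f"{a} * {b}: exact={exact}, mitchell={approx}, "
          f"error={(approx - exact) / exact:.2%}")
```

Note that Mitchell's approximation always underestimates the exact product; how such biased errors interact with convolution, fully-connected, and batch normalization layers is the subject of the sections that follow.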

PRELIMINARIES
ACCUMULATED ERROR IN CONVOLUTION
Understanding Convolution and FC Layers
Minimized Variance of Error
Impact on Convolution and FC
Grouped and Depthwise Convolutions
EFFECT OF BATCH NORMALIZATION
ARITHMETIC REASON FOR BFLOAT16 SUCCESS
EXPERIMENTS
Impact of Approximate Multiplication on CNNs
Effect of Batch Normalization
COMPARISON OF COSTS
RELATED WORKS
Findings
CONCLUSION