Abstract
This work proposes a new multiply-and-accumulate (MAC) processing unit structure that is well suited to on-device convolutional neural networks (CNNs). Observing that the bit-lengths needed to represent the input/output activations and weight parameters in on-device CNNs are small (i.e., low precision), usually no more than 9 bits, and vary across network layers, we propose a layer-by-layer composable MAC unit structure. The structure maximizes parallelism for the majority of operations, which have low precision, while incurring very little subsidiary processing overhead, and it remains effective in MAC resource utilization for the remaining operations. Precisely, the two key contributions of this work are: (1) our MAC unit structure supports two operation modes: in mode-0, a single internal multiplier performs each of the majority, low-precision multiplications, while in mode-1, a minimal number of internal multipliers are composed to perform the remaining high-precision multiplications; (2) for a set of input CNNs, we formulate the exploration of the bit-size of the single internal multiplier to derive an economical instance of the MAC unit structure, in terms of computation and energy cost, across all network layers. Our strategy contrasts strongly with the conventional MAC unit design, in which the MAC input size must be large enough to cover the largest bit-size of the activations and weight parameters. We show analytically and empirically that our MAC unit structure, together with the exploration of its instances, is very effective, reducing the computation cost per multiplication operation by 4.68∼30.3% and saving energy by 43.3% on average for the convolutional operations of AlexNet and VGG-16 over the conventional MAC unit structures.
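To make the two operation modes concrete, the following Python sketch behaviorally models a composable MAC unit built from m-bit internal multipliers; it is an illustration of the general shift-and-add composition technique, not the paper's hardware design, and the names `split_operand` and `mac_multiply` are our own. When both operands fit in m bits, a single multiplier suffices (mode-0); otherwise the operands are split into m-bit chunks and multiple multiplier invocations are composed (mode-1). The returned invocation count is the kind of per-operation cost that the exploration of the multiplier bit-size m would trade off against mode-0 coverage.

```python
def split_operand(x: int, m: int) -> list[int]:
    """Split a non-negative integer into m-bit chunks, least-significant first."""
    mask = (1 << m) - 1
    chunks = [x & mask]
    x >>= m
    while x:
        chunks.append(x & mask)
        x >>= m
    return chunks


def mac_multiply(a: int, w: int, m: int) -> tuple[int, int]:
    """Multiply a * w using only m-bit x m-bit multipliers.

    Returns (product, multiplier_invocations):
    mode-0 uses a single multiplier when both operands fit in m bits;
    mode-1 composes several multipliers via shifted partial products.
    """
    if a < (1 << m) and w < (1 << m):      # mode-0: one multiplier suffices
        return a * w, 1
    product, invocations = 0, 0            # mode-1: compose m-bit multipliers
    for i, ca in enumerate(split_operand(a, m)):
        for j, cw in enumerate(split_operand(w, m)):
            product += (ca * cw) << (m * (i + j))  # shifted partial product
            invocations += 1
    return product, invocations


# Example: with m = 4, a 5 x 300 multiplication needs 3 invocations,
# since 300 splits into three 4-bit chunks while 5 fits in one.
assert mac_multiply(5, 300, 4) == (1500, 3)
```

Under this model, a 9-bit by 9-bit multiplication with m = 4 costs ⌈9/4⌉² = 9 invocations, whereas m = 9 costs 1; a smaller m cheapens each mode-0 operation but inflates mode-1, which is precisely the trade-off the economical-instance exploration resolves over a layer's precision profile.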