Abstract
This work proposes a new multiply-and-accumulate (MAC) processing unit structure that is well suited to on-device convolutional neural networks (CNNs). Observing that the bit-lengths needed to represent the input/output activations and weight parameters in on-device CNNs are small (i.e., low precision), usually no more than 9 bits, and vary across network layers, we propose a layer-by-layer composable MAC unit structure. The structure maximizes parallelism for the majority of operations, which have low precision, while incurring very little subsidiary processing overhead, and it remains effective in MAC resource utilization for the remaining operations. Precisely, the two key contributions of this work are: (1) our MAC unit structure supports two operation modes: in mode-0, a single internal multiplier performs each of the majority, low-precision multiplications, while in mode-1, a minimal number of internal multipliers are composed to perform the remaining high-precision multiplications; (2) for a set of input CNNs, we formulate the exploration of the bit-size of the single internal multiplier to derive an economical instance of the MAC unit structure, in terms of computation and energy cost, across all network layers. Our strategy contrasts strongly with the conventional MAC unit design, in which the MAC input size must be large enough to cover the largest bit-size of the activations and weight parameters. We show analytically and empirically that our MAC unit structure, together with the exploration of its instances, is very effective, reducing the computation cost per multiplication operation by 4.68∼30.3% and saving energy by 43.3% on average for the convolutional operations of AlexNet and VGG-16 over the conventional MAC unit structures.
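To make the two operation modes concrete, the following Python sketch behaviorally models a composable MAC unit built from m-bit internal multipliers; it is an illustration of the general shift-and-add composition technique, not the paper's hardware design, and the names `split_operand` and `mac_multiply` are our own. When both operands fit in m bits, a single multiplier suffices (mode-0); otherwise the operands are split into m-bit chunks and multiple multiplier invocations are composed (mode-1). The returned invocation count is the kind of per-operation cost that the exploration of the multiplier bit-size m would trade off against mode-0 coverage.

```python
def split_operand(x: int, m: int) -> list[int]:
    """Split a non-negative integer into m-bit chunks, least-significant first."""
    mask = (1 << m) - 1
    chunks = [x & mask]
    x >>= m
    while x:
        chunks.append(x & mask)
        x >>= m
    return chunks


def mac_multiply(a: int, w: int, m: int) -> tuple[int, int]:
    """Multiply a * w using only m-bit x m-bit multipliers.

    Returns (product, multiplier_invocations):
    mode-0 uses a single multiplier when both operands fit in m bits;
    mode-1 composes several multipliers via shifted partial products.
    """
    if a < (1 << m) and w < (1 << m):      # mode-0: one multiplier suffices
        return a * w, 1
    product, invocations = 0, 0            # mode-1: compose m-bit multipliers
    for i, ca in enumerate(split_operand(a, m)):
        for j, cw in enumerate(split_operand(w, m)):
            product += (ca * cw) << (m * (i + j))  # shifted partial product
            invocations += 1
    return product, invocations


# Example: with m = 4, a 5 x 300 multiplication needs 3 invocations,
# since 300 splits into three 4-bit chunks while 5 fits in one.
assert mac_multiply(5, 300, 4) == (1500, 3)
```

Under this model, a 9-bit by 9-bit multiplication with m = 4 costs ⌈9/4⌉² = 9 invocations, whereas m = 9 costs 1; a smaller m cheapens each mode-0 operation but inflates mode-1, which is precisely the trade-off the economical-instance exploration resolves over a layer's precision profile.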