Abstract
Reducing energy consumption is key for deep neural networks (DNNs) to ensure usability and reliability, whether they are deployed on low-power end-nodes with limited resources or on high-performance platforms that serve large pools of users. Leveraging the over-parametrization exhibited by many DNN models, convolutional neural networks (ConvNets) in particular, energy efficiency can be improved substantially while preserving model accuracy. The solution proposed in this work exploits the intrinsic redundancy of ConvNets to maximize the reuse of partial arithmetic results during the inference stages. Specifically, the weight-set of a given ConvNet is discretized through a clustering procedure such that the largest possible number of inner multiplications falls into predefined bins; this allows an off-line computation of the most frequent results, which can then be stored locally and retrieved when needed during the forward pass. Such a reuse mechanism leads to remarkable energy savings with the aid of a custom processing element (PE) that integrates an associative memory with a standard floating-point unit (FPU). Moreover, the adoption of an approximate associative rule based on a partial bit-match increases the hit rate over the pre-computed results, maximizing the energy reduction even further. Results collected on a set of ConvNets trained for computer vision and speech-processing tasks reveal that the proposed associative-based HW-SW co-design achieves up to 77% energy savings with less than 1% accuracy loss.
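To make the reuse mechanism concrete, the following is a minimal NumPy sketch of the scheme described above: weights are clustered into a small set of centroids, the most frequent products are computed off-line into a table, and the forward pass retrieves a stored product whenever the incoming activation falls close enough to a stored level. Function and parameter names (cluster_weights, n_weight_bins, the 1e-2 hit threshold) are illustrative assumptions, not the paper's implementation; the actual design performs the lookup in an associative memory inside a custom PE backed by a standard FPU, not in software.

```python
# Software sketch of weight clustering + pre-computed product reuse (assumed names).
import numpy as np

def cluster_weights(w, n_weight_bins=16, iters=20):
    """Simple 1-D k-means: map every weight to the nearest centroid."""
    centroids = np.quantile(w, np.linspace(0.0, 1.0, n_weight_bins))
    for _ in range(iters):
        idx = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        for k in range(n_weight_bins):
            if np.any(idx == k):
                centroids[k] = w[idx == k].mean()
    return centroids, idx

def build_product_table(centroids, act_levels):
    """Off-line computation of the most frequent products (centroid x activation level)."""
    return np.outer(centroids, act_levels)          # shape: (n_weight_bins, n_act_bins)

def lookup_dot(w_idx, a, centroids, act_levels, table):
    """Dot product that reuses pre-computed products on a 'hit' and falls back
    to an exact multiplication on a 'miss' (illustrative hit criterion)."""
    a_idx = np.argmin(np.abs(a[:, None] - act_levels[None, :]), axis=1)
    hit = np.abs(a - act_levels[a_idx]) < 1e-2
    prods = np.where(hit, table[w_idx, a_idx], centroids[w_idx] * a)
    return prods.sum(), hit.mean()

# Toy usage: one neuron with 256 weights and 256 input activations.
rng = np.random.default_rng(0)
w, a = rng.standard_normal(256), rng.standard_normal(256)
centroids, w_idx = cluster_weights(w)
act_levels = np.quantile(a, np.linspace(0.0, 1.0, 16))
table = build_product_table(centroids, act_levels)
y, hit_rate = lookup_dot(w_idx, a, centroids, act_levels, table)
print(f"approx dot = {y:.3f}, exact dot = {w @ a:.3f}, hit rate = {hit_rate:.2f}")
```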
Highlights
In the last decade, convolutional neural networks (ConvNets) have outclassed traditional machine learning algorithms in several tasks, from image classification [1,2] to audio [3,4] and natural language processing [5,6].
The experiments conducted on computer vision tasks and keyword spotting reveal that our approach achieves up to 77% energy savings with a negligible accuracy loss (below 1%).
Razlighi et al. [18] proposed a look-up search into a special content-addressable memory (CAM) mapped onto a resistive technology as a substitute for multiply-and-accumulate (MAC) units. This approach targeted simple multilayer perceptrons (MLPs), which consist of fully connected layers only, while it is known that convolutional layers dominate the energy consumption in ConvNets [41,42].
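The abstract also mentions an approximate associative rule based on a partial bit-match, which raises the hit rate over the pre-computed results. The snippet below is a hedged, software-only illustration of that idea, assuming float32 operands and a plain dictionary standing in for the CAM: dropping the least significant mantissa bits of both operands makes nearly identical operand pairs collapse onto the same key. The drop_bits width and the toy operand values are assumptions, not figures from the paper.

```python
# Illustrative partial bit-match lookup: compare only the most significant bits
# of the operands so that nearby values map to the same stored product.
import struct

def partial_key(x: float, drop_bits: int = 14) -> int:
    """Encode a float32 and zero its `drop_bits` least significant mantissa bits."""
    raw = struct.unpack("<I", struct.pack("<f", x))[0]
    return raw & ~((1 << drop_bits) - 1)

# Off-line: store products keyed by the truncated bit patterns of both operands.
cam = {}
for w in (0.50, 0.25, -0.75):          # clustered weight centroids (toy values)
    for a in (0.10, 0.20, 0.30):       # frequent activation values (toy values)
        cam[(partial_key(w), partial_key(a))] = w * a

# On-line: a slightly different activation still hits the stored entry, since with
# drop_bits=14 the truncated patterns of 0.20 and 0.2001 coincide.
w, a = 0.50, 0.2001
result = cam.get((partial_key(w), partial_key(a)))
print("hit" if result is not None else "miss", result)   # miss -> fall back to the FPU
```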
Summary
Convolutional neural networks (ConvNets) have outclassed traditional machine learning algorithms in several tasks, from image classification [1,2] to audio [3,4] and natural language processing [5,6]. Approximations can be applied at different levels by means of different knobs: (i) the data format, with mini-floats [9,10] or fixed-point quantization [11,12,13]; (ii) the arithmetic precision, replacing exact multiplications with an approximate version [14,15]; (iii) the algorithmic structure, for instance simplifying standard convolutions with an alternative formulation, such as Winograd [16] or frequency-domain convolution [3]. The convolutional layers are characterized by stencil loops that update array elements according to fixed patterns, thereby producing repetitive workloads with a high degree of temporal and spatial locality. This offers the opportunity to implement reuse mechanisms that alleviate the computational workload, as illustrated by the sketch below. A final softmax layer calculates the output probability score across the available classes.
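As a concrete illustration of the locality argument, the following sketch (written for exposition, not the paper's kernel) runs a direct 2-D convolution with its characteristic stencil loops and counts how often the same (weight, activation) operand pair recurs once both tensors are coarsely quantized; the high fraction of repeated pairs is what a table of pre-computed products can exploit. The quantization step and tensor sizes are assumptions.

```python
# Count repeated operand pairs in a direct 2-D convolution (valid padding).
import numpy as np
from collections import Counter

def conv2d_product_stats(x, w):
    """Direct convolution that also tallies every (weight, activation) pair."""
    H, W = x.shape
    K, _ = w.shape
    y = np.zeros((H - K + 1, W - K + 1))
    pairs = Counter()
    for i in range(H - K + 1):          # stencil loops: fixed access pattern
        for j in range(W - K + 1):
            patch = x[i:i + K, j:j + K]
            y[i, j] = np.sum(patch * w)
            for wv, av in zip(w.ravel(), patch.ravel()):
                pairs[(wv, av)] += 1
    return y, pairs

# Toy example with coarsely quantized inputs and weights (steps of 0.25):
rng = np.random.default_rng(0)
x = np.round(rng.standard_normal((32, 32)) * 4) / 4
w = np.round(rng.standard_normal((3, 3)) * 4) / 4
y, pairs = conv2d_product_stats(x, w)
total = sum(pairs.values())
print(f"multiplications: {total}, distinct operand pairs: {len(pairs)}, "
      f"reusable fraction: {1 - len(pairs) / total:.2f}")
```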