Abstract
Neural networks generally require significant memory capacity and bandwidth to store and access their large numbers of synaptic weights. This paper presents the design of an energy-efficient neural network inference engine based on adaptive weight compression using the JPEG image encoding algorithm. To maximize the compression ratio with minimal accuracy loss, the quality factor of the JPEG encoder is adaptively controlled according to the accuracy impact of each weight block. At 1% accuracy loss, the proposed approach achieves $63.4\times$ compression for a multilayer perceptron (MLP) and $31.3\times$ for LeNet-5 on the MNIST dataset, and $15.3\times$ for AlexNet and $10.2\times$ for ResNet-50 on ImageNet. The reduced memory requirement leads to higher throughput and lower energy for neural network inference ($3\times$ effective memory bandwidth and $22\times$ lower system energy for the MLP).
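The paper targets a hardware JPEG decoder, but as a rough software sketch of the adaptive quality-factor idea it describes, the following Python snippet uniformly quantizes a weight block to 8 bits, round-trips it through Pillow's JPEG codec, and lowers the quality factor as far as a per-block accuracy budget allows. The block shape, the quality ladder, and the `accuracy_drop` callback are illustrative assumptions, not the authors' implementation.

```python
import io
import numpy as np
from PIL import Image

def quantize_u8(w):
    """Uniformly quantize a float weight block to 8 bits for JPEG encoding."""
    lo, hi = float(w.min()), float(w.max())
    q = np.round((w - lo) / (hi - lo + 1e-12) * 255.0).astype(np.uint8)
    return q, (lo, hi)  # keep (lo, hi) to dequantize after decoding

def jpeg_roundtrip(block_u8, quality):
    """JPEG-encode an 8-bit weight block and decode it back; return the
    reconstructed block and its compressed size in bytes."""
    buf = io.BytesIO()
    Image.fromarray(block_u8, mode="L").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf), dtype=np.uint8), buf.getbuffer().nbytes

def adaptive_quality(block_u8, accuracy_drop, max_drop=0.01,
                     qualities=(5, 10, 25, 50, 75, 95)):
    """Pick the lowest JPEG quality factor whose reconstruction keeps the
    network's accuracy drop within max_drop.

    accuracy_drop(recon) is a hypothetical caller-supplied callback that
    re-evaluates the network with the reconstructed block substituted in
    and returns the resulting accuracy loss (e.g., 0.004 for 0.4%).
    """
    for q in qualities:  # ascending: try the most aggressive setting first
        recon, nbytes = jpeg_roundtrip(block_u8, q)
        if accuracy_drop(recon) <= max_drop:
            return q, recon, nbytes
    return q, recon, nbytes  # fall back to the highest quality tried
```

Blocks whose weights matter less to accuracy tolerate a lower quality factor and thus compress harder, which is what drives the per-network compression ratios reported above.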