Abstract

Convolutional neural networks (CNNs) have led to remarkable progress in a number of key pattern recognition tasks, such as visual scene understanding and speech recognition, that enable numerous potential applications. Consequently, there is a significant need to deploy trained CNNs to resource-constrained embedded systems. Inference using pretrained modern deep CNNs, however, requires significant system resources, including computation, energy, and memory. To enable efficient implementation of trained CNNs, a viable approach is to approximate the network with an implementation-friendly model that incurs only negligible degradation in classification accuracy. We present Ristretto, a CNN approximation framework that enables empirical investigation of the tradeoff between the choice of number representation and word width on one hand and the classification accuracy of the model on the other. Specifically, Ristretto analyzes a given CNN with respect to the numerical range required to represent weights, activations, and intermediate results of convolutional and fully connected layers, and subsequently it simulates the impact of reduced word width or lower precision arithmetic operators on the model accuracy. Moreover, Ristretto can fine-tune a quantized network to further improve its classification accuracy under a given number representation and word width configuration. Given a maximum classification accuracy degradation tolerance of 1%, we use Ristretto to demonstrate that three ImageNet networks can be condensed to use 8-bit dynamic fixed point for network weights and activations. Ristretto is available as a popular open-source software project and has already been viewed over 1,000 times on GitHub as of the submission of this brief.
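For readers unfamiliar with the dynamic fixed point format the abstract refers to, the Python sketch below illustrates the core idea: all values in a tensor share a single fractional length, chosen from the tensor's numerical range, so that an 8-bit integer mantissa plus a per-tensor scale can stand in for 32-bit floating point. The function name and the range-based rule for choosing the fractional length are illustrative assumptions for this sketch, not Ristretto's actual implementation (which is realized as an extension of the Caffe framework).

```python
import numpy as np

def quantize_dynamic_fixed_point(x, bit_width=8):
    """Simulate dynamic fixed point quantization of a tensor.

    All values share one fractional length derived from the tensor's
    range; each value is stored as a signed bit_width-bit mantissa.
    Illustrative sketch only, not Ristretto's exact algorithm.
    """
    # Integer length: enough bits to cover the largest magnitude
    # (one extra bit accounts for the sign).
    max_abs = float(np.max(np.abs(x)))
    int_length = int(np.ceil(np.log2(max_abs))) + 1 if max_abs > 0 else 1
    frac_length = bit_width - int_length  # remaining bits hold the fraction

    scale = 2.0 ** frac_length
    # Round to the nearest representable step and clip to the signed range.
    q_min = -(2 ** (bit_width - 1))
    q_max = 2 ** (bit_width - 1) - 1
    mantissa = np.clip(np.round(x * scale), q_min, q_max)
    return mantissa / scale  # dequantized values, as inference would see them

# Example: simulate 8-bit weights for one convolutional layer.
weights = (np.random.randn(64, 3, 3, 3) * 0.1).astype(np.float32)
w_q = quantize_dynamic_fixed_point(weights, bit_width=8)
print("max quantization error:", np.max(np.abs(weights - w_q)))
```

Because the fractional length is chosen per tensor rather than fixed globally, layers with small weight magnitudes keep fine resolution while layers with large activations still avoid overflow, which is what allows 8-bit representations to stay within the 1% accuracy tolerance reported above.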
