Fast Adjustable Threshold for Uniform Neural Network Quantization

Alexander Goncharenko, Andrey Denisov, Sergey Alyamkin, Evgeny Terentev

doi:10.5281/zenodo.3461980

Abstract

Quantization is a necessary step when porting neural networks to mobile devices. It accelerates inference and reduces memory consumption and model size. Quantization can be performed without fine-tuning, using a calibration procedure (the calculation of the parameters needed for quantization), or the network can be trained with quantization from scratch. Training with quantization from scratch on labeled data is a long and resource-consuming procedure, while quantizing a network without fine-tuning leads to an accuracy drop because of outliers that appear during calibration. In this article we propose to simplify the quantization procedure significantly by introducing trained scale factors for the quantization thresholds. This shortens quantization with fine-tuning to at most 8 epochs and reduces the requirements on the set of training images. To our knowledge, the proposed method allowed us to obtain the first publicly available quantized version of MNAS without a significant accuracy reduction: 74.8% versus 75.3% for the original full-precision network. The model and code are ready for use and available at: this https URL.
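
The idea of scaling a calibrated threshold can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic weights, the 4-bit width (chosen only to make the effect visible on a small sample), and the use of a grid search over mean squared error in place of gradient-based training are all assumptions made for illustration.

```python
import random


def fake_quantize(values, threshold, num_bits=8):
    """Symmetric uniform quantization: clip to [-threshold, threshold],
    round to the nearest grid point, and map back to floats."""
    levels = 2 ** (num_bits - 1) - 1      # e.g. 127 for 8 bits, 7 for 4 bits
    step = threshold / levels             # distance between grid points
    quantized = []
    for v in values:
        clipped = max(-threshold, min(threshold, v))
        quantized.append(round(clipped / step) * step)
    return quantized


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def best_scale(values, base_threshold, candidates, num_bits=8):
    """Hypothetical stand-in for training: pick the scale factor alpha that
    minimises the quantization error for threshold = alpha * base_threshold."""
    def error(alpha):
        q = fake_quantize(values, alpha * base_threshold, num_bits)
        return mean_squared_error(values, q)
    return min(candidates, key=error)


random.seed(0)
# Synthetic "weights": a Gaussian bulk plus a single large outlier.
weights = [random.gauss(0.0, 1.0) for _ in range(2000)] + [40.0]

# Calibration without fine-tuning: the threshold is the largest absolute
# value, so the outlier stretches the grid over the whole range.
base_threshold = max(abs(w) for w in weights)

# Candidate scale factors 0.05, 0.10, ..., 1.00 (1.00 keeps the calibrated threshold).
candidates = [i / 20 for i in range(1, 21)]
alpha = best_scale(weights, base_threshold, candidates, num_bits=4)

mse_calibrated = mean_squared_error(
    weights, fake_quantize(weights, base_threshold, num_bits=4))
mse_scaled = mean_squared_error(
    weights, fake_quantize(weights, alpha * base_threshold, num_bits=4))

print(f"alpha={alpha:.2f}  calibrated MSE={mse_calibrated:.4f}  scaled MSE={mse_scaled:.4f}")
```

In this toy setting the selected scale factor shrinks the threshold below the calibrated maximum, trading a clipping error on the outlier for a finer grid over the bulk of the values. A real training setup would instead adjust the scale factors by backpropagation on the network's loss.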

Journal: Zenodo (CERN European Organization for Nuclear Research)	Publication Date: Jan 1, 2018
License type: cc-by