Quantization noise in low bit quantization and iterative adaptation to quantization noise in quantizable neural networks

D Chudakov,A Goncharenko,S Alyamkin,A Densidov

doi:10.1088/1742-6596/2134/1/012004

Open Access

Journal: Journal of Physics: Conference Series
Publication Date: Dec 1, 2021
License type: cc-by

Affiliation: Novosibirsk State University

Keywords: Low-bit Quantization, Quantization Noise

Abstract

Quantization is one of the most widely used methods of speeding up a neural network. The current standard is 8-bit uniform quantization. Nevertheless, uniform low-bit quantization (4- and 6-bit) offers significant advantages in inference speed and resource requirements. We present a quantization algorithm designed for uniform low-bit quantization: it is faster than quantization-aware training from scratch and more accurate than methods aimed only at selecting thresholds and reducing quantization noise. We also investigated quantization noise in neural networks under low-bit quantization and concluded that quantization noise is not always a good metric for quantization quality.
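To make the terms in the abstract concrete, the following is a minimal sketch of symmetric uniform quantization and of quantization noise measured as mean squared error, compared at 4, 6, and 8 bits. It is not the authors' algorithm. The function names, the max-absolute-value threshold, and the Gaussian test tensor are all assumptions chosen for illustration.

```python
import numpy as np


def uniform_quantize(x, num_bits, threshold=None):
    """Symmetric uniform quantization: snap x to a signed integer grid, then rescale.

    `threshold` is the clipping range; by default (an assumption of this
    sketch) it is the largest absolute value in x.
    """
    if threshold is None:
        threshold = float(np.max(np.abs(x)))
    if threshold == 0.0:
        return np.zeros_like(x)
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 for 4 bits, 127 for 8 bits
    scale = threshold / qmax                # width of one quantization step
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale


def quantization_noise(x, num_bits, threshold=None):
    """Quantization noise as the MSE between x and its quantized copy."""
    return float(np.mean((x - uniform_quantize(x, num_bits, threshold)) ** 2))


# Hypothetical stand-in for a layer's weights.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 1.0, size=10_000)

noise = {bits: quantization_noise(weights, bits) for bits in (4, 6, 8)}
for bits, mse in noise.items():
    print(f"{bits}-bit: MSE = {mse:.6f}")
```

In this sketch the noise grows as the bit width shrinks, because the step size grows. A lower per-tensor MSE does not by itself guarantee better network accuracy, which matches the abstract's caution about using quantization noise as a quality metric.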
