Abstract

Every bit matters in the hardware design of quantized neural networks. However, extremely low-bit representations usually cause a large accuracy drop, so training extremely low-bit neural networks with high accuracy is of central importance. Most existing network quantization approaches learn the transformations (low-bit weights) and the encodings (low-bit activations) simultaneously. This tight coupling makes the optimization problem difficult and prevents the network from learning optimal representations. In this paper, we propose a simple yet effective Two-Step Quantization (TSQ) framework that decomposes network quantization into two steps: code learning, followed by transformation-function learning based on the learned codes. For the first step, we propose a sparse quantization method for code learning. The second step can be formulated as a non-linear least-squares regression problem with low-bit constraints, which can be solved efficiently in an iterative manner. Extensive experiments on the CIFAR-10 and ILSVRC-12 datasets demonstrate that the proposed TSQ is effective and outperforms the state of the art by a large margin. In particular, for 2-bit activations and ternary weights on AlexNet, the accuracy of TSQ drops only about 0.5 points relative to the full-precision counterpart, outperforming the current state of the art by more than 5 points.

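To make the two-step decomposition concrete, the sketch below (Python/NumPy, not taken from the paper) mimics the overall flow: activations are first sparsified and quantized into fixed codes, and ternary weights with a per-unit scale are then fit to those codes by alternating a closed-form scale update with coordinate descent over the ternary entries. The function names, the 2-bit/ternary settings, the sparsity threshold, and the solver itself are illustrative assumptions; the paper's second step additionally keeps the activation quantizer inside the regression loss (hence "non-linear least squares"), which this sketch omits for brevity.

```python
import numpy as np

# --- Step 1: sparse code learning ------------------------------------------
# Assumes non-negative (post-ReLU) activations. Values below a sparsity
# threshold are set to zero; the rest are mapped to a few uniform levels.
# The threshold rule and the 2-bit setting (n_levels=3) are assumptions.
def sparse_quantize_activations(x, n_levels=3, sparsity=0.5):
    theta = np.quantile(x, sparsity)                 # zero out small activations
    kept = np.where(x > theta, x, 0.0)
    scale = kept.max() / n_levels if kept.max() > 0 else 1.0
    return np.clip(np.round(kept / scale), 0, n_levels) * scale

# --- Step 2: fit ternary weights to the fixed codes ------------------------
# For one output unit with inputs X (n x d) and target codes z (n,), find a
# scale alpha and ternary weights w in {-1, 0, +1} so that alpha * X @ w
# approximates z in the least-squares sense, alternating a closed-form scale
# update with coordinate descent over the ternary entries (a simplified
# stand-in for the paper's iterative solver).
def fit_ternary_unit(X, z, n_iters=10):
    d = X.shape[1]
    w = np.sign(np.linalg.lstsq(X, z, rcond=None)[0])  # warm start from lstsq
    alpha = 1.0
    for _ in range(n_iters):
        y = X @ w
        alpha = (y @ z) / max(y @ y, 1e-8)             # optimal scale for fixed w
        r = z - alpha * y                              # current residual
        for i in range(d):                             # ternary coordinate descent
            r_i = r + alpha * w[i] * X[:, i]           # residual without w[i]
            cands = np.array([-1.0, 0.0, 1.0])
            errs = [np.sum((r_i - alpha * c * X[:, i]) ** 2) for c in cands]
            w[i] = cands[int(np.argmin(errs))]
            r = r_i - alpha * w[i] * X[:, i]
    return w, alpha
```

Applying fit_ternary_unit column by column to a layer's weight matrix, with the targets taken from sparse_quantize_activations of the full-precision layer outputs, reproduces the code-then-transform flow at a toy scale.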