Abstract
Quantizing the weights and activations of deep neural networks is essential for deploying them on resource-constrained devices or on cloud platforms for at-scale services. Although binarization is a special case of quantization, this extreme case often leads to training difficulties and necessitates specialized models and training methods. As a result, recent quantization methods do not support binarization, losing the most resource-efficient option, and quantized and binarized networks have remained distinct research areas. We examine binarization difficulties within a quantization framework and find that all we need to enable binary training is a symmetric quantizer, good initialization, and careful hyperparameter selection. These techniques also lead to substantial improvements in multi-bit quantization. We demonstrate our unified quantization framework, denoted UniQ, on the ImageNet dataset with various architectures such as ResNet-18, ResNet-34, and MobileNetV2. For multi-bit quantization, UniQ outperforms existing methods and achieves state-of-the-art accuracy. For binarization, the achieved accuracy is comparable to existing state-of-the-art methods even without modifying the original architectures.
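For concreteness, the sketch below shows the kind of symmetric uniform quantizer the abstract refers to, with a single scalar step size per layer and binarization as the one-bit special case. This is a minimal illustration under those assumptions, not the paper's exact formulation; the function name and signature are hypothetical.

import numpy as np

def symmetric_quantize(w, step, bits):
    # Uniform symmetric quantizer with a shared scalar step size.
    # Levels are integer multiples of `step` in the symmetric range
    # [-(2**(bits-1) - 1), ..., 2**(bits-1) - 1]; for bits == 1 this
    # reduces to binarization with the two levels {-step, +step}.
    if bits == 1:
        return np.where(w >= 0, step, -step)
    n = 2 ** (bits - 1) - 1          # symmetric clipping level, e.g. 7 for 4 bits
    return np.clip(np.round(w / step), -n, n) * step

# Example usage on a random weight tensor:
w = np.random.randn(64, 64) * 0.05
w_q4 = symmetric_quantize(w, step=0.02, bits=4)   # 4-bit symmetric quantization
w_q1 = symmetric_quantize(w, step=0.02, bits=1)   # binarization

Note that the symmetric level set deliberately drops the extra negative level that a two's-complement (asymmetric) range would provide, which is the distinction the summary below draws.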
Highlights
Deep neural networks have achieved tremendous success in various fields, including computer vision [31], natural language processing [52], and speech recognition [8], demonstrating unprecedented predictive performance.
To demonstrate the effectiveness of our proposed method, we evaluate it on the CIFAR-100 [30] and ImageNet [45] datasets.
The experimental results are compared with various recent works on multi-bit quantization and neural network binarization.
Summary
Deep neural networks have achieved tremendous success in various fields, including computer vision [31], natural language processing [52], and speech recognition [8], demonstrating unprecedented predictive performance. Signed integers are assumed to be represented in two's complement, which has an asymmetric range, whereas our framework uses a symmetric quantizer; our quantization method does not transform the weights of the pre-trained models. We propose an optimal, analytic initialization for the step sizes. According to our ablation study, the proposed framework shows significant improvements over prior works as a combined result of the symmetric quantizer and the optimal initialization. We scrutinize the training dynamics of the binary case and find that it receives strong gradient signals at the beginning of training and that the distribution of the quantizer input changes extremely fast compared to multi-bit cases. We hypothesize that this difference arises because the initial point after binarization is too far from the pre-trained solution. Our method does not modify the original architectures, meaning that it can be used in conjunction with network modification techniques.
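To make the idea of initializing step sizes from the pre-trained weights concrete, the sketch below picks the step size that minimizes the mean-squared quantization error by a simple scan over candidates. The scan, the candidate grid, and the MSE criterion are illustrative assumptions standing in for the paper's closed-form analytic rule, and the function name is hypothetical.

import numpy as np

def init_step_size(w, bits, num_candidates=200):
    # Illustrative MSE-based step-size initialization: evaluate a grid of
    # candidate step sizes and keep the one with the smallest quantization
    # error on the pre-trained weight tensor `w`.
    n = 1 if bits == 1 else 2 ** (bits - 1) - 1
    max_abs = np.abs(w).max()
    candidates = np.linspace(max_abs / (10.0 * n), max_abs / n, num_candidates)
    best_step, best_err = candidates[0], np.inf
    for step in candidates:
        if bits == 1:
            w_q = np.where(w >= 0, step, -step)          # binary levels {-step, +step}
        else:
            w_q = np.clip(np.round(w / step), -n, n) * step
        err = np.mean((w - w_q) ** 2)
        if err < best_err:
            best_step, best_err = step, err
    return best_step

Initializing the quantizer this way keeps the quantized network close to the pre-trained solution at the start of training, which is the failure mode the summary identifies for the binary case.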