Abstract

We consider learning deep neural networks (DNNs) with low-precision weights and activations for efficient inference using fixed-point operations. When training low-precision networks, gradient descent in the backward pass is performed on high-precision weights, while quantized low-precision weights and activations are used in the forward pass to compute the training loss. The gradient descent therefore becomes suboptimal, and accuracy loss follows. To reduce this mismatch between the forward and backward passes, we employ mean squared quantization error (MSQE) regularization. In particular, we propose a learnable regularization coefficient for the MSQE regularizer to reinforce the convergence of high-precision weights to their quantized values. We also investigate how partial L2 regularization can be employed for weight pruning in a similar manner. Finally, combining weight pruning, quantization, and entropy coding, we establish a low-precision DNN compression pipeline. In our experiments, the proposed method yields low-precision MobileNet and ShuffleNet models on ImageNet classification with state-of-the-art compression ratios of 7.13 and 6.79, respectively. Moreover, we apply our method to image super-resolution networks to produce 8-bit low-precision models with negligible performance loss.
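To make the regularization idea concrete, below is a minimal PyTorch-style sketch of an MSQE regularizer with a learnable coefficient. The uniform quantizer, bin size `delta`, bit width, and the exp/log parameterization of the coefficient are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


def uniform_quantize(w, delta, num_bits=4):
    # Symmetric uniform quantizer with bin size `delta`, clipped to the signed
    # fixed-point range (an assumed quantizer, for illustration only).
    qmax = 2 ** (num_bits - 1) - 1
    return torch.clamp(torch.round(w / delta), -qmax, qmax) * delta


class MSQERegularizer(nn.Module):
    """MSQE regularizer with a learnable coefficient (illustrative sketch).

    The coefficient is parameterized as alpha = exp(rho) so it stays positive;
    the -log(alpha) term is one plausible way to keep alpha from collapsing to
    zero and to let it grow as the quantization error shrinks.
    """

    def __init__(self, delta=0.05, num_bits=4):
        super().__init__()
        self.rho = nn.Parameter(torch.zeros(()))  # alpha = exp(rho) > 0
        self.delta = delta
        self.num_bits = num_bits

    def forward(self, weights):
        alpha = torch.exp(self.rho)
        msqe = sum(
            ((w - uniform_quantize(w, self.delta, self.num_bits)) ** 2).mean()
            for w in weights
        )
        return alpha * msqe - torch.log(alpha)
```

In use, the regularizer output would simply be added to the task loss, e.g. `loss = criterion(logits, labels) + reg([p for n, p in model.named_parameters() if n.endswith("weight")])`, so that the gradient of the MSQE term pulls each high-precision weight toward its quantized value while `alpha` adapts during training.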

Highlights

  • Deep neural networks (DNNs) have achieved performance breakthroughs in many computer vision tasks [1]

  • We reduce the mismatch between high-precision weights and quantized weights with mean squared quantization error (MSQE) regularization

  • Our training keeps high-precision weights for the backward passes and gradient descent, while the forward passes use quantized low-precision weights and activations; the resulting networks can run on low-precision fixed-point hardware at inference time (see the sketch after this list)
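The sketch below illustrates this training scheme with a straight-through autograd function: the forward pass sees quantized values, while gradients flow back to the high-precision copy that the optimizer updates. The class name, bin size, and pass-through backward rule are assumptions for illustration, not necessarily the paper's exact backward computation.

```python
import torch


class QuantizeSTE(torch.autograd.Function):
    """Quantize in the forward pass; pass the gradient straight through in the
    backward pass so the optimizer keeps updating the high-precision copy."""

    @staticmethod
    def forward(ctx, x, delta, num_bits):
        qmax = 2 ** (num_bits - 1) - 1
        return torch.clamp(torch.round(x / delta), -qmax, qmax) * delta

    @staticmethod
    def backward(ctx, grad_output):
        # Gradient w.r.t. x is passed through unchanged; `delta` and
        # `num_bits` receive no gradient in this simplified sketch.
        return grad_output, None, None


# Example: the forward pass uses quantized weights, while gradients land on
# the high-precision tensor `weight` that the optimizer would update.
weight = torch.randn(8, 3, 3, 3, requires_grad=True)
x = torch.randn(1, 3, 32, 32)
w_q = QuantizeSTE.apply(weight, 0.05, 8)
y = torch.nn.functional.conv2d(x, w_q, padding=1)
y.sum().backward()  # weight.grad is populated despite the quantized forward
```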


Summary

INTRODUCTION

Deep neural networks (DNNs) have achieved performance breakthroughs in many computer vision tasks [1]. For weight quantization, we regularize the (unpruned) weights with an additional regularization term, the mean squared quantization error (MSQE). In this stage, we also quantize the activations (feature maps) of each layer to mimic low-precision operations at inference time. High-precision weights are thereby reinforced to converge gradually to their quantized values during training. The loss-aware weight quantization of [12], [13] uses the proximal Newton algorithm to minimize the loss function under low-precision weight constraints, which is impractical for large networks due to the prohibitive cost of estimating the Hessian of the loss. Our method instead uses stochastic gradient descent, while the mismatch between high-precision weights and quantized weights is minimized by the MSQE regularization. Scaling factors (i.e., quantization bin sizes) are defined per layer for fixed-point weights and activations, respectively, to adjust their dynamic ranges.
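As a rough illustration of per-layer scaling factors, the sketch below wraps a convolution so that its weights and (non-negative) input activations are quantized on per-layer fixed-point grids. The fixed bin sizes and bit widths are placeholder hyperparameters, and only the forward-pass quantization is shown; training would pair this with a straight-through backward rule as in the earlier sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuantConv2d(nn.Conv2d):
    """Convolution with per-layer scaling factors (quantization bin sizes) for
    fixed-point weights and activations. Bin sizes and bit widths below are
    placeholder hyperparameters, not values from the paper."""

    def __init__(self, *args, w_bits=4, a_bits=8, w_delta=0.05, a_delta=0.1, **kwargs):
        super().__init__(*args, **kwargs)
        self.w_bits, self.a_bits = w_bits, a_bits
        self.w_delta, self.a_delta = w_delta, a_delta

    def forward(self, x):
        # Input activations are assumed non-negative (e.g. post-ReLU), so an
        # unsigned grid with 2**a_bits - 1 levels and bin size a_delta is used.
        a_levels = 2 ** self.a_bits - 1
        x_q = torch.clamp(torch.round(x / self.a_delta), 0, a_levels) * self.a_delta
        # Weights use a signed grid with bin size w_delta.
        w_max = 2 ** (self.w_bits - 1) - 1
        w_q = torch.clamp(torch.round(self.weight / self.w_delta), -w_max, w_max) * self.w_delta
        return F.conv2d(x_q, w_q, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)
```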

REGULARIZATION FOR WEIGHT QUANTIZATION
REGULARIZATION FOR WEIGHT PRUNING
EXPERIMENTS
EXPERIMENTAL SETTINGS
EXPERIMENTAL RESULTS
Findings
CONCLUSION