Abstract

Deep neural networks (DNNs) require a large number of parameters, and many researchers have therefore sought to compress them; network pruning, quantization, and knowledge distillation have all been studied for this purpose. Considering the realistic scenario of deploying a DNN on a resource-constrained device, where the deployed network should perform well at various bit-widths without re-training while maintaining reasonable accuracy, we propose the quantization-robust pruning with knowledge distillation (QRPK) method. In QRPK, model weights are divided into essential and inessential weights based on their magnitude. QRPK then trains a quantization-robust model with a high pruning ratio by shaping the distribution of the essential weights into a quantization-friendly distribution. We conducted experiments on CIFAR-10 and CIFAR-100 to verify the effectiveness of QRPK, and a QRPK-trained model performs well at various bit-widths, as it is designed around pruning, quantization robustness, and knowledge distillation.
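The sketch below is not the paper's code; it is a minimal illustration, under stated assumptions, of the magnitude-based split the abstract describes: weights whose magnitude falls below a quantile determined by the pruning ratio are treated as inessential and zeroed, and the rest are kept as essential. The function name split_by_magnitude and the use of NumPy are illustrative choices, not part of QRPK.

import numpy as np

def split_by_magnitude(weights: np.ndarray, pruning_ratio: float):
    """Return (essential_mask, pruned_weights) for a given pruning ratio."""
    # Threshold chosen so that roughly `pruning_ratio` of the weights are removed.
    threshold = np.quantile(np.abs(weights).ravel(), pruning_ratio)
    essential_mask = np.abs(weights) > threshold
    # Inessential weights are set to zero; essential weights are kept unchanged.
    pruned_weights = weights * essential_mask
    return essential_mask, pruned_weights

# Example: prune 90% of a random weight tensor.
w = np.random.randn(64, 128).astype(np.float32)
mask, w_pruned = split_by_magnitude(w, pruning_ratio=0.9)
print(f"kept {mask.mean():.1%} of weights as essential")

In QRPK, the essential weights identified by such a split would additionally be regularized toward a quantization-friendly distribution during training, which the sketch does not attempt to show.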
