Learning Low-Precision Structured Subnetworks Using Joint Layerwise Channel Pruning and Uniform Quantization

Xinyu Zhang, Ian Colbert, Srinjoy Das

doi:10.3390/app12157829

Abstract

Pruning and quantization are core techniques used to reduce the inference costs of deep neural networks. Among the state-of-the-art pruning techniques, magnitude-based pruning algorithms have demonstrated consistent success in reducing both weight and feature map complexity. However, we find that existing measures of neuron (or channel) importance used for such pruning procedures have at least one of two limitations: (1) they fail to consider the interdependence between successive layers; and/or (2) they perform the estimation in a parametric setting or rely on distributional assumptions about the feature maps. In this work, we demonstrate that the importance rankings of the output neurons of a given layer depend strongly on the sparsity level of the preceding layer, and therefore that naïvely estimating neuron importance to drive magnitude-based pruning leads to suboptimal performance. Informed by this observation, we propose a purely data-driven, nonparametric, magnitude-based channel pruning strategy that works greedily, layer by layer, based on the activations of the previous sparsified layer. We demonstrate that our proposed method works effectively in combination with statistics-based quantization techniques to generate low-precision structured subnetworks that can be efficiently accelerated by hardware platforms such as GPUs and FPGAs. Using our proposed algorithms, we demonstrate increased performance per memory footprint over existing solutions across a range of discriminative and generative networks.
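
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a small fully connected ReLU network stored as nested lists, scores each output channel by its mean absolute activation on a calibration batch, and computes those activations from the already-pruned output of the previous layer. The quantizer is a plain symmetric max-abs uniform quantizer, standing in for the statistics-based techniques the abstract mentions. All function names, the `keep_ratio` parameter, and the toy weights are hypothetical.

```python
def forward(batch, weight):
    """ReLU(batch @ weight) on nested lists; weight is indexed [in][out]."""
    n_out = len(weight[0])
    return [
        [max(0.0, sum(x[i] * weight[i][j] for i in range(len(x))))
         for j in range(n_out)]
        for x in batch
    ]


def greedy_layerwise_prune(weights, calib_batch, keep_ratio):
    """Return one boolean keep-mask per layer (assumed scoring rule).

    Each layer is scored on activations produced from the previous
    layer's *pruned* output, so earlier pruning decisions influence
    later rankings. The final layer is left unpruned.
    """
    masks = []
    h = calib_batch
    for idx, w in enumerate(weights):
        a = forward(h, w)
        n_out = len(w[0])
        if idx == len(weights) - 1:
            mask = [True] * n_out  # keep every output of the last layer
        else:
            # Mean absolute activation per output channel.
            scores = [sum(abs(row[j]) for row in a) / len(a)
                      for j in range(n_out)]
            k = max(1, round(keep_ratio * n_out))
            ranked = sorted(range(n_out), key=lambda j: scores[j], reverse=True)
            keep = set(ranked[:k])
            mask = [j in keep for j in range(n_out)]
        masks.append(mask)
        # Zero the pruned channels before scoring the next layer.
        h = [[v if m else 0.0 for v, m in zip(row, mask)] for row in a]
    return masks


def uniform_quantize(values, n_bits=8):
    """Symmetric uniform quantization (assumed max-abs scaling).

    Returns the dequantized values and the step size (scale).
    """
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) * scale
                 for v in values]
    return quantized, scale


# Toy network: two hidden layers of width 2, one output.
toy_weights = [
    [[2.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 5.0]],
    [[1.0], [1.0]],
]
toy_masks = greedy_layerwise_prune(toy_weights, [[1.0, 1.0]], keep_ratio=0.5)
```

In this toy case the second layer's channel 1 would have the larger activation if the first layer were left dense (5 versus 2), but once channel 1 of the first layer is pruned its activation drops to zero, so the greedy pass keeps channel 0 instead. This is the kind of dependence on the preceding layer's sparsity that the abstract describes.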


Journal: Applied Sciences
Publication Date: Aug 4, 2022
Citations: 3
License type: CC BY 4.0


