Long-range zero-shot generative deep network quantization

Yan Luo,Yangcheng Gao,Zhao Zhang,Jicong Fan,Haijun Zhang,Mingliang Xu

doi:10.1016/j.neunet.2023.07.042

Abstract

Quantization approximates a deep network model with floating-point numbers by the model with low bit width numbers, thereby accelerating inference and reducing computation. Zero-shot quantization, which aims to quantize a model without access to the original data, can be achieved by fitting the real data distribution through data synthesis. However, it has been observed that zero-shot quantization leads to inferior performance compared to post-training quantization with real data for two primary reasons: 1) a normal generator has difficulty obtaining a high diversity of synthetic data since it lacks long-range information to allocate attention to global features, and 2) synthetic images aim to simulate the statistics of real data, which leads to weak intra-class heterogeneity and limited feature richness. To overcome these problems, we propose a novel deep network quantizer called long-range zero-shot generative deep network quantization (LRQ). Technically, we propose a long-range generator (LRG) to learn long-range information instead of simple local features. To incorporate more global features into the synthetic data, we use long-range attention with large-kernel convolution in the generator. In addition, we also present an adversarial margin add (AMA) module to force intra-class angular enlargement between the feature vector and class center. The AMA module forms an adversarial process that increases the convergence difficulty of the loss function, which is opposite to the training objective of the original loss function. Furthermore, to transfer knowledge from the full-precision network, we also utilize decoupled knowledge distillation. Extensive experiments demonstrate that LRQ obtains better performance than other competitors.

Full Text