OMPQ: Orthogonal Mixed Precision Quantization

Yuexiao Ma, Xiawu Zheng, Taisong Jin, Yongjian Wu, Wei Zhang, Rongrong Ji, Huixia Li, Guannan Jiang, Yan Wang

doi:10.1609/aaai.v37i7.26084

Abstract

To bridge the ever-increasing gap between deep neural networks' complexity and hardware capability, network quantization has attracted growing research attention. The latest trend, mixed precision quantization, takes advantage of hardware support for arithmetic at multiple bit-widths to unleash the full potential of network quantization. However, existing approaches rely heavily on an extremely time-consuming search process and various relaxations when seeking the optimal bit configuration. To address this issue, we propose to optimize a proxy metric, network orthogonality, which is highly correlated with quantized model accuracy and bit-width and can be optimized efficiently with linear programming. Our approach reduces the search time and the amount of data required by orders of magnitude without compromising quantization accuracy. Specifically, we achieve 72.08% Top-1 accuracy on ResNet-18 with 6.7Mb parameters, without any search iterations. Given the high efficiency and low data dependency of our algorithm, we also apply it to post-training quantization, where it achieves 71.27% Top-1 accuracy on MobileNetV2 with only 1.5Mb parameters.
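
The abstract does not give the linear program itself, so the following is only a minimal sketch of how a per-layer bit-width assignment can be posed as one, not the authors' formulation. It assumes a hypothetical per-layer importance score (standing in for the orthogonality-based metric), a hypothetical list of per-layer parameter counts, and a single model-size budget in bits. With one budget constraint and box bounds on each bit-width, the relaxed (continuous) linear program is solved exactly by spending the budget in order of importance per parameter. The function name `allocate_bits` and all inputs are illustrative assumptions.

```python
def allocate_bits(importance, num_params, budget_bits, min_bits=2, max_bits=8):
    """Continuous relaxation of a bit-width assignment problem.

    Maximize   sum_i importance[i] * bits[i]
    subject to sum_i num_params[i] * bits[i] <= budget_bits
               min_bits <= bits[i] <= max_bits

    All arguments are hypothetical stand-ins; the importance scores are
    NOT the metric from the paper, only a placeholder for it.
    """
    n = len(importance)
    # Start every layer at the lowest allowed precision.
    bits = [float(min_bits)] * n
    remaining = budget_bits - sum(p * min_bits for p in num_params)
    if remaining < 0:
        raise ValueError("budget is too small to give every layer min_bits")

    # For an LP with one budget constraint and box bounds, filling layers in
    # decreasing order of importance per parameter is optimal.
    order = sorted(range(n), key=lambda i: importance[i] / num_params[i],
                   reverse=True)
    for i in order:
        if importance[i] <= 0 or remaining <= 0:
            break  # extra bits no longer help, or the budget is spent
        extra = min(max_bits - min_bits, remaining / num_params[i])
        bits[i] += extra
        remaining -= extra * num_params[i]
    return bits


# Illustrative inputs only: three layers with made-up scores and sizes.
scores = [0.9, 0.5, 0.1]
sizes = [1000, 4000, 16000]
relaxed = allocate_bits(scores, sizes, budget_bits=100_000)
# Hardware bit-widths are integers; rounding down keeps the budget satisfied.
integer_bits = [int(b) for b in relaxed]
```

Rounding the relaxed solution down is a further simplification made here for brevity; an integer program or a general-purpose LP solver would be the natural choice once there is more than one constraint.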

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Jun 26, 2023
Citations: 3

Similar Papers

GenSyth: a new way to understand deep learning
Alexander Wong ... Francis Li
Electronics Letters | VOL. 55
01 Sep 2019

RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization
Zhe Wang ... Jie Lin
01 Jan 2021

NetScore: Towards Universal Metrics for Large-Scale Performance Analysis of Deep Neural Networks for Practical On-Device Edge Usage
Alexander Wong
01 Jan 2019

L2L: A Highly Accurate Log_2_Lead Quantization of Pre-trained Neural Networks
Salim Ullah ... Siddharth Gupta
01 Mar 2020
