Abstract
Among various network compression methods, network quantization has developed rapidly due to its superior compression performance. However, trivial activation quantization schemes limit the compression performance of network quantization. Most conventional activation quantization methods directly utilize the rectified activation functions to quantize models, yet their unbounded outputs generally yield drastic accuracy degradation. To tackle this problem, we propose a comprehensive activation quantization technique namely Bilateral Clipping Parametric Rectified Linear Unit (BCPReLU) as a generalized version of all rectified activation functions, which limits the quantization range more flexibly during training. Specifically, trainable slopes and thresholds are introduced for both positive and negative inputs to find more flexible quantization scales. We theoretically demonstrate that BCPReLU has approximately the same expressive power as the corresponding unbounded version and establish its convergence in low-bit quantization networks. Extensive experiments on a variety of datasets and network architectures demonstrate the effectiveness of our trainable clipping activation function.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.