Abstract

As convolutional neural networks have demonstrated state-of-the-art performance in object recognition and detection, there is a growing need for deploying these systems on resource-constrained mobile platforms. However, the computational burden and energy consumption of inference for these networks are significantly higher than what most low-power devices can afford. To address these limitations, this paper proposes a method to train object detection networks with low-precision weights and activations. The probability density functions of weights and activations of each layer are first directly estimated using piecewise Gaussian models. Then, the optimal quantization intervals and step sizes for each convolution layer are adaptively determined according to the distribution of weights and activations. As the most computationally expensive convolutions can be replaced by effective fixed point operations, the proposed method can drastically reduce computation complexity and memory footprint. Performing on the tiny you only look once (YOLO) and YOLO architectures, the proposed method achieves comparable accuracy to their 32-bit counterparts. As an illustration, the proposed 4-bit and 8-bit quantized versions of the YOLO model achieve a mean average precision of 62.6% and 63.9%, respectively, on the Pascal visual object classes 2012 test dataset. The mAP of the 32-bit full-precision baseline model is 64.0%.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.