Abstract
In the field of object detection, deep learning has greatly improved accuracy compared to previous algorithms and has been widely used in recent years. However, object detection using deep learning requires substantial hardware (HW) resources because of the huge amount of computation needed for high performance, making real-time operation on embedded platforms very difficult. Therefore, various compression methods have been studied to solve this problem. In particular, quantization methods greatly reduce the computational burden of deep learning by reducing the number of bits used to represent weights and activations. However, most existing studies target only object classification and cannot be applied to object detection. Furthermore, most existing quantization studies are based on floating-point operations, which require additional effort when implementing HW accelerators. This paper proposes an HW-friendly fixed-point-based quantization method that can also be applied to object detection. In the proposed method, the center of the weight distribution is shifted to zero by subtracting the mean of the weight parameters before quantization, and retraining is applied iteratively to minimize the accuracy drop caused by quantization. Furthermore, when the proposed method is applied to object detection, performance degradation is minimized by considering the minimum and maximum values of the weight parameters of the network. When the proposed quantization method is applied to representative one-stage object detectors, You Only Look Once v3 and v4 (YOLOv3 and YOLOv4), detection accuracy similar to that of the original single-precision floating-point (32-bit) networks is maintained on the COCO dataset, even though the weights are expressed with only about 20% of the bits of the single-precision floating-point format.
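As a rough illustration of the idea described above, the sketch below centers a weight tensor by subtracting its mean and then quantizes it to a signed fixed-point representation whose range is chosen from the largest magnitude of the centered weights. The bit width, the power-of-two scale selection, the rounding rule, and the function name quantize_weights_fixed_point are illustrative assumptions; the paper's exact procedure, including the iterative retraining step, is not reproduced here.

```python
import numpy as np

def quantize_weights_fixed_point(weights, num_bits=8):
    # Shift the center of the weight distribution to zero by
    # subtracting the mean of the weight parameters (assumed step order).
    mean = weights.mean()
    centered = weights - mean

    # Pick a power-of-two scale from the largest magnitude of the
    # centered weights so the fixed-point range covers their min/max.
    max_abs = float(np.max(np.abs(centered))) + 1e-12
    frac_bits = (num_bits - 1) - int(np.ceil(np.log2(max_abs)))
    scale = 2.0 ** frac_bits

    # Round to signed fixed-point integers and clip to the valid range.
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    q = np.clip(np.round(centered * scale), qmin, qmax).astype(np.int32)

    # Dequantized weights, e.g., for measuring the accuracy drop
    # before a retraining pass.
    dequantized = q / scale + mean
    return q, frac_bits, mean, dequantized

# Usage example: quantize a randomly initialized convolution kernel.
w = np.random.normal(loc=0.01, scale=0.05, size=(3, 3, 64, 64)).astype(np.float32)
q, frac_bits, mean, w_hat = quantize_weights_fixed_point(w, num_bits=8)
print(frac_bits, np.abs(w - w_hat).max())
```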
Highlights
In recent years, with the development of the Graphics Processing Unit (GPU), the field of deep neural networks (DNNs) has advanced tremendously
This paper proposes a fixed-point-based quantization method appropriate for HW implementation (e.g., SoC design and field programmable gate array (FPGA) implementation) that can be applied to object detection to overcome the shortcomings of existing studies
To verify the effect of the proposed techniques on object detection, experiments are conducted on the YOLOv3 and YOLOv4 networks using the COCO dataset
Summary
With the development of the Graphics Processing Unit (GPU), there has been tremendous development in the field of deep neural networks (DNNs). Because convolutional neural networks (CNNs) are used in the field of computer vision, the accuracy of object detection and classification has increased dramatically [1]–[4].

A. RELATED WORKS
Several studies have applied quantization to CNNs because of its robustness against performance degradation, the ease of applying the concept of approximate computing, and the lack of need to change the network structure [21]–[26]. Most recent quantization studies [21]–[23] have focused on INT quantization with integer parameters and floating-point scale factors to prevent an accuracy decrease in networks that perform image classification. In an embedded platform or HW accelerator environment, it is much more efficient to convert the weight or activation parameters to a purely fixed-point format, and quantization with a fixed-point format remains an important research topic in terms of practicality and HW efficiency. As deep learning becomes more actively used in mobile applications, the importance of practicality is expected to increase
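To make this contrast concrete, the sketch below shows, under illustrative assumptions (8-bit signed weights, per-tensor scaling, hypothetical function names int_quantize and fixed_point_rescale), how INT-style quantization keeps a floating-point scale factor, whereas a purely fixed-point scheme with a power-of-two scale lets an accelerator rescale integer accumulators with a simple arithmetic shift instead of a floating-point multiply.

```python
import numpy as np

def int_quantize(weights, num_bits=8):
    """INT-style quantization: integer weights plus a floating-point
    scale factor, as in many classification-oriented studies.
    Per-tensor scaling and the rounding rule are illustrative choices."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = float(np.max(np.abs(weights))) / qmax       # floating-point scale
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale                                      # dequantize as q * scale

def fixed_point_rescale(accumulator, frac_bits):
    """Fixed-point alternative: with a power-of-two scale, rescaling an
    integer accumulator reduces to an arithmetic right shift, avoiding
    floating-point multipliers in the accelerator datapath."""
    return accumulator >> frac_bits

# Usage example: the INT path still needs the float scale at runtime,
# while the fixed-point path stays entirely in integer arithmetic.
w = np.random.uniform(-0.3, 0.3, size=(128,)).astype(np.float32)
q_int, fp_scale = int_quantize(w)
acc = np.int64(12345)                                    # example integer accumulator
print(fixed_point_rescale(acc, frac_bits=7))
```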