Abstract

Feature pyramids have become an essential component in most modern object detectors, such as Mask RCNN, YOLOv3, RetinaNet. In these detectors, the pyramidal feature representations are commonly used which represent an image with multi-scale feature layers. However, the detectors can’t be used in many real world applications which require real time performance under a computationally limited circumstance. In the paper, we study network architecture in YOLOv3 and modify the classical backbone--darknet53 of YOLOv3 by using a group of convolutions and dilated convolutions (DC). Then, a novel one-stage object detection network framework called DC-YOLOv3 is proposed. A lot of experiments on the Pascal 2017 benchmark prove the effectiveness of our framework. The results illustrate that DC-YOLOv3 achieves comparable results with YOLOv3 while being about 1.32× faster in training time and 1.38× faster in inference time.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.