Abstract

CNN-based object detectors such as RetinaNet, Faster R-CNN, and the CornerNet series achieve good performance but share common drawbacks: high computational cost, high model complexity, and slow detection speed. In this paper, a new lightweight object detector is proposed that adopts a density-based approach to merge the ground-truth boxes. To reduce computational cost and improve detection speed, a multi-scale output strategy is adopted to predict objects of different sizes from features at different scales. Furthermore, a new lightweight network model is proposed that performs better in computation, FPS, and model complexity. In addition, convolution separation is used to improve the basic convolution layer, achieving better results with the same number of filters. We verify our method through ablation experiments and model evaluation, which demonstrate its superiority. We also conduct deep-network and multichannel experiments on the MS-COCO2014 dataset and achieve 20.9% mAP.

Highlights

  • Object detection is one of the three fundamental problems of computer vision, with important applications in autonomous driving [1]–[3], image/video retrieval [4], [5], video surveillance [6], [7], and other fields

  • (1) Recent studies have shown that components added or improved in the field of object detection cannot bring about substantial changes

  • Based on the above analysis, a new anchor generation algorithm is proposed. It generates prior boxes that match the target application scene more closely by using the features of the dataset itself, and it replaces the manually set or k-means-derived anchors used by current anchor-based detectors

Summary

INTRODUCTION

Object detection is one of the three fundamental problems of computer vision, with important applications in autonomous driving [1]–[3], image/video retrieval [4], [5], video surveillance [6], [7], and other fields. In the one-stage detector YOLO [12]–[14], improved k-means algorithms [15], [16] merge the ground-truth boxes in the dataset by computing IoU and Distance-IoU (DIoU) [52], generating several groups of boxes at different scales. Based on the above analysis, a new anchor generation algorithm is proposed. It generates prior boxes that match the target application scene more closely by using the features of the dataset itself, and it replaces the manually set or k-means-derived anchors used by current anchor-based detectors. 1) A new algorithm for generating prior boxes is proposed. It adopts a density-based approach [21] to merge the ground-truth boxes in the dataset, with the aim of obtaining the optimal length and width of the boxes and reducing the complexity of subsequent computation.
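
For orientation, the k-means baseline mentioned above can be sketched roughly as follows. This is not the density-based algorithm the paper proposes, whose details are not given in this summary; it is only a minimal illustration of clustering ground-truth box sizes with an IoU-based similarity. The function names `wh_iou` and `kmeans_anchors`, the use of plain IoU rather than DIoU, and the assumption that boxes are given as (width, height) pairs aligned at a shared corner are all illustrative choices, not taken from the source.

```python
import numpy as np


def wh_iou(boxes, anchors):
    """IoU between (N, 2) boxes and (K, 2) anchors given as (width, height).

    Assumption: every box and anchor shares the same top-left corner,
    so only the sizes matter. Returns an (N, K) matrix.
    """
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    area_boxes = (boxes[:, 0] * boxes[:, 1])[:, None]
    area_anchors = (anchors[:, 0] * anchors[:, 1])[None, :]
    return inter / (area_boxes + area_anchors - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster box sizes into k anchors, assigning each box by highest IoU."""
    rng = np.random.default_rng(seed)
    # Start from k distinct ground-truth boxes picked at random.
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most.
        assign = np.argmax(wh_iou(boxes, anchors), axis=1)
        updated = anchors.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # leave an empty cluster's anchor unchanged
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, anchors):
            break
        anchors = updated
    # Return anchors sorted from smallest to largest area.
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]
```

Calling `kmeans_anchors(boxes, k)` on an (N, 2) float array of box sizes returns k anchors ordered by area, which could then be distributed across the detector's output scales. A density-based method would replace the fixed `k` and the mean-update step with a grouping driven by how densely the box sizes are packed.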

RELATED WORK
OUR METHOD
KERNEL SIZE CHOICE
FOCAL LOSS FUNCTION
Findings
CONCLUSION