Abstract

In most object detection frameworks, the classification confidence of an instance is used as the quality criterion for predicted bounding boxes, for example in the confidence-based ranking of non-maximum suppression (NMS). However, the quality of a bounding box, which reflects its spatial relation to the object, is only partly correlated with the classification score. Compared with detectors based on a region proposal network (RPN), single-shot object detectors suffer from lower box quality because they lack a pre-selection of box proposals. In this paper, we focus on single-shot object detectors and propose location-aware anchor-based reasoning (LAAR) for bounding boxes. LAAR takes both the location and classification confidences into consideration when evaluating bounding-box quality. We introduce a novel network block that learns the relative location between the anchors and the ground truths, denoted as a localization score, which acts as a location reference during the inference stage. The localization score is learned by an independent regression branch, and the predicted score is used to calibrate bounding-box quality so that the best-qualified bounding boxes are selected in NMS. Experiments on the MS COCO and PASCAL VOC benchmarks demonstrate that the proposed location-aware framework improves the performance of current anchor-based single-shot object detection frameworks and yields consistent and robust detection results.

Highlights

  • Deep networks have dramatically driven the progress of computer vision, producing a series of popular models for different vision tasks [31], [35], such as image classification [3], [29], object detection [15], [32], crowd counting [25], depth estimation [10], and image translation [30].

  • We aim at single-shot object detectors that yield a better trade-off between accuracy and speed. (The associate editor coordinating the review of this manuscript and approving it for publication was Zhenbao Liu.)

  • Ablation studies: we evaluate the contribution of one important element of our location-aware box reasoning for object detection, namely the constraint introduced by the Localization Score Regression.


Summary

INTRODUCTION

Deep networks have dramatically driven the progress of computer vision, producing a series of popular models for different vision tasks [31], [35], such as image classification [3], [29], object detection [15], [32], crowd counting [25], depth estimation [10], and image translation [30]. We build an independent regression branch into the single-shot object detection framework. This branch learns the location confidence and merges it into the box quality metric used by NMS, so as to obtain a more reliable priority ranking. It is important that a detector can determine when its detection results are trustworthy and when they are not. This motivates us to attach a localization score, obtained by location-aware anchor-based reasoning, to every bounding box predicted from an anchor position. In anchor-based detectors, location awareness supplements the evaluation of bounding-box quality from the perspective of location accuracy. It is realized by learning the IoU between the anchors and the ground truths, which produces the localization scores. S(a_j) should work well on two tasks: indicating the correct category of the box and regressing the IoU between the proposals and the foreground objects.
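The two ideas in the paragraph above, IoU-based localization targets and a location-aware ranking in NMS, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`iou`, `localization_targets`, `location_aware_nms`), the (x1, y1, x2, y2) box format, and in particular the fusion of the two confidences by a simple product are assumptions made for illustration; the text does not state how the scores are merged.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def localization_targets(anchors: Sequence[Box], gts: Sequence[Box]) -> List[float]:
    """Hypothetical regression target per anchor: its best IoU with any ground truth."""
    return [max((iou(a, g) for g in gts), default=0.0) for a in anchors]


def location_aware_nms(
    boxes: Sequence[Box],
    cls_scores: Sequence[float],
    loc_scores: Sequence[float],
    iou_thresh: float = 0.5,
) -> List[int]:
    """Greedy NMS ranked by a fused score.

    Assumption: fused score = classification score * localization score.
    Returns the indices of the kept boxes in ranking order.
    """
    fused = [c * l for c, l in zip(cls_scores, loc_scores)]
    order = sorted(range(len(boxes)), key=lambda i: fused[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


# Two overlapping boxes on one object plus one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
cls_scores = [0.9, 0.8, 0.7]   # box 0 has the highest classification score
loc_scores = [0.5, 0.9, 0.8]   # box 1 is predicted to be better localized
kept = location_aware_nms(boxes, cls_scores, loc_scores)
targets = localization_targets([(0, 0, 10, 10), (20, 20, 30, 30)], [(0, 0, 10, 5)])
```

In this toy input, ranking by classification score alone would keep box 0 and suppress box 1, whereas the fused ranking keeps box 1 because its predicted localization score is higher. In an actual detector the localization scores would come from the learned regression branch rather than being given by hand.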

LOCALIZATION SCORE REGRESSION
EXPERIMENTS
Findings
CONCLUSION
