Abstract

Recent years have witnessed significant advances in deep-learning-based object detection. Although the problem has been extensively explored, most existing detectors are designed to predict object locations with relatively low quality, i.e., they are typically trained with an Intersection over Union (IoU) threshold of 0.5. This can yield low-quality or even noisy detections. Designing high-quality object detectors with more precise localization (e.g., IoU > 0.5) remains an open challenge. In this paper, we propose a novel single-shot detection framework called Bidirectional Pyramid Networks (BPN) for high-quality object detection. It comprises two novel components: (i) a Bidirectional Feature Pyramid structure and (ii) Anchor Refinement (AR). The bidirectional feature pyramid structure uses semantically rich deep-layer features to enhance the shallow-layer features and, simultaneously, uses spatially rich shallow-layer features to enhance the deep-layer features, leading to a stronger representation of both small and large objects for high-quality detection. The anchor refinement scheme gradually improves the quality of pre-designed anchors by learning multi-level regressors, giving more precise localization predictions. We performed extensive experiments on both the PASCAL VOC and MSCOCO datasets and achieved the best performance among all single-shot detectors. The advantage was especially pronounced in the high-quality detection regime.
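
To make the IoU threshold mentioned above concrete, the following is a minimal sketch of how IoU between two axis-aligned boxes can be computed and compared against a threshold. It is an illustration only, not the implementation used in BPN: the `(x1, y1, x2, y2)` box format and the names `iou` and `is_positive` are assumptions introduced here.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format (assumed)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region; zero if the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_positive(anchor, ground_truth, threshold=0.5):
    """Hypothetical helper: treat an anchor as a match if its IoU meets the threshold."""
    return iou(anchor, ground_truth) >= threshold
```

Under this sketch, a box that overlaps a ground-truth box by half its width has an IoU of only 1/3, so it would be rejected at a threshold of 0.5; raising the threshold demands progressively tighter localization.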
