Abstract

Object proposal is a crucial pre-task for many image and video understanding applications. However, modern object proposal approaches are typically based on closed-world assumptions and focus only on pre-defined categories, so they cannot meet the diverse needs of real-world applications. To address this limitation, we introduce two strategies, the eliminating strategy and the mining strategy, to robustly train the Object Localization Network (OLN) for open-world object proposal. The eliminating strategy takes into account the spatial configuration of labeled boxes and eliminates box anchors that overlap with multiple objects. The mining strategy employs a pseudo-label guided self-training scheme to mine object boxes in novel categories. Without bells and whistles, our proposed method outperforms previous state-of-the-art methods on large-scale benchmarks, including COCO, Objects365, and UVO. The source code is available at https://github.com/hustvl/EM-OLN.
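
As an illustration of the eliminating strategy described above, the following is a minimal sketch, not the released implementation. It assumes that "overlap" is measured by IoU and that an anchor is dropped when it overlaps at least two labeled boxes above a threshold; the function names and the `overlap_thr` value are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def eliminate_ambiguous_anchors(
    anchors: List[Box],
    labeled_boxes: List[Box],
    overlap_thr: float = 0.3,  # assumed value, for illustration only
) -> List[Box]:
    """Keep only anchors that overlap fewer than two labeled boxes."""
    kept = []
    for anchor in anchors:
        # Count the labeled objects this anchor overlaps noticeably.
        n_overlaps = sum(
            1 for gt in labeled_boxes if iou(anchor, gt) >= overlap_thr
        )
        if n_overlaps < 2:
            kept.append(anchor)
    return kept


labeled = [(0, 0, 10, 10), (10, 0, 20, 10)]
anchors = [(5, 0, 15, 10), (0, 0, 10, 10), (30, 30, 40, 40)]
kept = eliminate_ambiguous_anchors(anchors, labeled)
```

In this toy setup the anchor straddling both labeled boxes is removed, while the anchor matching a single object and the anchor over unlabeled background are kept for training.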
